FINAL REPORT - STRUCTURE FROM MOTION
AFOSR  Contract  F49620-83-0140 


a.  Objectives. 

Our  principal  objective  continues  to  be  the  development  of  a  robust  computational  approach 
for  estimating  the  spatial  organization  of  a  scene  using  time  varying  properties  of  image 
sequences.  Under  this  contract,  we  have  been  investigating  improved  methods  for  estimating 
and  interpreting  optical  flow  from  image  sequences.  Emphasis  is  placed  both  on  what  spatial 
properties should be computed and on appropriate computational architectures for accomplishing this task.

Three  related  questions  have  been  investigated  in  this  project. 

Estimating  optical  flow. 

What  sorts  of  errors  are  intrinsic  to  spatial-temporal  gradient  techniques  for  estimating 
optical  flow?  The  principal  objective  of  this  aspect  of  the  work  is  to  develop  a  priori 
estimates  of  expected  error  based  on  the  nature  of  the  actual  imagery,  and  a  posteriori 
error  estimates  as  an  integral  aspect  of  flow  estimation.  In  addition,  the  research  effort 
has focused on how flow estimation can be improved based on an understanding of the
nature  and  magnitude  of  the  errors  that  are  likely  to  arise. 

Interpreting  optical  flow  at  object  boundaries. 

How  can  the  analysis  of  optical  flow  be  used  to  detect  object  boundaries?  How  can  the 
three-dimensional  structure  of  object  boundaries  be  determined  based  on  optical  flow? 
The principal objective here is to work towards the development of motion-based segmentation techniques for image understanding. Motion-based segmentation has the
potential  not  only  for  locating  object  boundaries,  but  also  for  reducing  problems  due  to 
occlusion  and  for  providing  three-dimensional  information  useful  for  object 
identification  and  analysis. 

Robust  methods  for  determining  object  motion. 

How can the motion of an object relative to the camera be determined in a robust
manner?  The  objective  is  to  categorize  the  possible  motions  into  a  limited  number  of 
meaningful  classes  and  to  develop  methods  for  recognizing  instances  of  each  class. 


b.  Status  of  research  effort. 

Estimating  optical  flow. 

We have shown that a major difficulty with gradient-based methods is their sensitivity
to a number of conditions commonly encountered in real imagery. Highly textured
surfaces, motion boundaries, and depth discontinuities can all be troublesome for
gradient-based  methods.  Fortunately,  these  problematic  areas  can  be  identified  in  the 
image.  As  a  part  of  this  contract,  we  examined  the  conditions  that  lead  to  errors, 
methods  to  reduce  errors,  and  the  estimation  of  measurement  errors  for  one  class  of 
gradient-based  techniques.  By  understanding  how  errors  arise  we  are  able  to  define  the 
inherent  limitations  of  the  gradient-based  technique,  obtain  estimates  of  the  accuracy 
of  computed  values,  enhance  the  performance  of  the  technique,  and  demonstrate  the 
informative  value  of  some  types  of  errors. 

This  part  of  the  project  has  now  been  completed. 

Interpreting  optical  flow  at  object  boundaries. 

Significant  results  have  been  achieved  on  the  problems  associated  with  motion-based 
segmentation.  Discontinuities  in  optical  flow  are  necessarily  due  to  surface  boundaries 
or  discontinuities  in  depth  in  the  scene.  Thus,  detected  edges  in  flow  necessarily 
correspond to important properties of scene geometry, whereas edges in properties such
as  luminance  can  be  due  to  a  wide  variety  of  scene  properties.  Our  approach  is  based 
on  understanding  the  three-dimensional  scene  structure  leading  to  an  edge  in  optical 
flow.  As  a  result,  we  can  simultaneously  detect  edges  and  determine  important  three- 
dimensional  properties  of  the  associated  scene  surfaces. 

Motion-based  segmentation  can  not  only  find  boundaries  that  are  difficult  to  locate  in  a 
single  view,  but  it  can  also  provide  much  more  information  about  the  structure  of  the 
scene.  Our  approach  makes  it  possible  to  distinguish  between  occluding  and  occluded 
surfaces  at  a  boundary.  Occlusion  boundaries  arise  due  to  geometric  properties  of  the 
occluding  surface,  not  the  occluded  surface.  Thus,  while  the  shape  of  the  edge  provides 
significant  information  on  the  structure  of  the  occluding  surface,  it  says  little  or  nothing 
about  the  structure  of  the  surface  being  occluded.  This  technique  may  make  it  possible 
to  link  image  regions  corresponding  to  a  partially  occluded  object  and  to  produce 
descriptions of object boundaries that are less affected by occlusion. In addition, being
able to distinguish between occluding and occluded boundaries is a crucial step towards
determining the three-dimensional position of surfaces.

Work  is  continuing  on  exploiting  these  results  in  a  variety  of  image  understanding 
tasks. 

Robust  methods  for  determining  object  motion. 

Object  motion  can  be  classified  based  on  optical  flow  into  categories  that  are  significant 
for  further  interpretation.  In  our  investigations,  object  motion  was  divided  into  four 
classes: two types of translation and two types of rotation. Complex motions can be
described as combinations of these types. The descriptions are qualitative, characterizing
the motion in terms of broad classes but not providing precise, quantitative information
about trajectories. We have shown that under some circumstances, the categories
are  detectable  using  simple  differential  operations  on  the  optical  flow  field.  Appropriate 
combinations  of  detectors  can  be  used  to  signal  motions  likely  to  lead  to  a  collision 
between  the  sensor  and  an  object  in  the  field  of  view.  By  structuring  the  technique  as  a 
classification  operation  involving  only  a  limited  number  of  classes,  the  noise  sensitivity 
of differential operators can be reduced. For the situations in which the technique is
applicable,  it  is  tolerant  of  noisy,  sparse  flow  fields  and  requires  little  information  about 
camera  models,  motion  constraints,  or  possible  objects. 

As a result of our research efforts, we discovered that the assumptions required to utilize
this approach are not sufficiently realistic. We are currently pursuing alternate
approaches  to  the  determination  of  object  motion. 
ABSTRACT

Multiple  views  of  a  scene  can  provide  important  information  about  the  structure 
and  dynamic  behavior  of  three-dimensional  objects.  To  recover  this  information,  it  is 
necessary to estimate optical flow, the velocity, on the image, of visible points on
object surfaces. One approach for estimating optical flow is based on the relationship
between the gradients of image brightness. While gradient-based methods have been
widely studied, little attention has been paid to accuracy and reliability of the
approach.

We examine the sources of errors in estimates derived from gradient-based techniques.
By understanding how errors arise, we are able to define the inherent limitations
of the technique, obtain estimates of the accuracy of computed values, enhance
the performance of the technique, and demonstrate the informative value of some
types of errors.

1.  Introduction. 

The  velocity  field  that  represents  the  motion  of  object  points  across  an  image  is  called 
the  optical  flow  field.  Optical  flow  results  from  relative  motion  between  a  camera  and 
objects in the scene. Methods which estimate optical flow lie within two general classes.
Gradient-based approaches utilize a relationship between the motion of surfaces and the
derivatives of image brightness [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. Matching techniques locate and
track  small,  identifiable  regions  of  the  image  over  time. 

For  many  problems  gradient-based  methods  offer  significant  advantages  over  matching 
techniques. Matching techniques are highly sensitive to ambiguity among the structures to
be matched. Optical flow can be accurately estimated for only highly distinguishable regions.
This means that flow can only be determined at a sparse sampling of points across the image.
Furthermore, it is computationally impractical to estimate matches for a large number of
points.  The  gradient-based  approach  allows  optical  flow  to  be  simply  computed  at  a  more 
dense  sampling  of  points  than  can  be  obtained  with  matching  methods. 

This work was supported by the Air Force Office of Scientific Research contract F49620-83-0140.
Gradient-based techniques avoid the difficult task of finding distinguishable regions or
points of interest. The gradient approach leads to algorithms which are characterized by
simple computations localized to small regions of the image. These techniques can be applied
over the entire image. As we shall see in the analysis that follows, the gradient technique is
also sensitive to ambiguous areas: it is impossible to locally determine the motion of a
homogeneous region. However, gradient-based estimates are typically available over a
greater area than those obtained reliably by matching. In addition, the loss of precision for
gradient-based estimates in ambiguous areas can be quantified. Accuracy measurements can
be used to weight the contribution of motion estimates in further analysis or to filter poor
estimates from the flow field. These accuracy measurements can be obtained as a by-product
of the flow estimation process and require little additional computation.

While gradient-based methods have been widely studied, little attention has been paid
to accuracy and reliability of the approach. A major difficulty with gradient-based methods
is their sensitivity to conditions commonly encountered in real imagery. Highly textured
surfaces, motion boundaries, and depth discontinuities can all be troublesome for gradient-based
methods. Fortunately, these problematic areas can be identified in the image. In this paper
we examine the conditions that lead to errors, methods to reduce errors, and the estimation
of measurement errors for one class of gradient-based techniques. By understanding how
errors arise we are able to define the inherent limitations of the gradient-based technique,
obtain estimates of the accuracy of computed values, enhance the performance of the
technique, and demonstrate the informative value of some types of errors.


2.  The  Gradient  Constraint  Equation. 

The gradient constraint equation relates optical flow, the velocity $(u, v)$ on the image,
and the image brightness function $I(x, y, t)$. The common assumption of gradient-based
techniques is that the observed brightness (intensity on the image plane) of any object
point is constant over time. Consequently, any change in intensity at a point on the image
must be due to motion. Relative motion between the object and camera will cause an object
point located at $(x, y)$ at time $t$ to change position on the image over a time
interval $\delta t$. By the constant brightness assumption, the intensity of the object point
will be the same in images sampled at times $t$ and $t + \delta t$. The constant brightness
assumption can be formally stated as

$$I(x, y, t) = I(x + \delta x, y + \delta y, t + \delta t) \tag{1}$$

Expanding the image brightness function in a Taylor series around the point $(x, y, t)$
we obtain

$$I(x + \delta x, y + \delta y, t + \delta t) = I(x, y, t) + \frac{\partial I}{\partial x}\,\delta x + \frac{\partial I}{\partial y}\,\delta y + \frac{\partial I}{\partial t}\,\delta t + \text{h.o.t.} \tag{2}$$

Substituting (1) into (2), dividing by $\delta t$, and taking the limit as $\delta t \to 0$
leads to the gradient constraint equation:

$$I_x u + I_y v + I_t = 0 \tag{3}$$

where $I_x$, $I_y$, and $I_t$ are the partial derivatives of brightness with respect to $x$, $y$, and $t$.


3.  Gradient  Based  Algorithms. 

The gradient constraint equation does not by itself provide a means for calculating optical
flow. The equation only constrains the values of $u$ and $v$ to lie on a line when plotted in
flow coordinates.
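The constraint line can be made concrete with a short sketch. It assumes the constraint is written as $I_x u + I_y v + I_t = 0$; the function names and the gradient values are illustrative only and do not come from the report. The point on the line closest to the origin is the flow component along the gradient, and any displacement along the line leaves the constraint satisfied.

```python
import math


def constraint_line(ix, iy, it):
    """Describe the line ix*u + iy*v + it = 0 in (u, v) flow coordinates.

    Returns (point, direction): the flow vector on the line closest to the
    origin, and a unit vector pointing along the line.
    """
    mag_sq = ix * ix + iy * iy
    if mag_sq == 0.0:
        raise ValueError("zero spatial gradient: the constraint is degenerate")
    scale = -it / mag_sq
    point = (scale * ix, scale * iy)
    mag = math.sqrt(mag_sq)
    direction = (-iy / mag, ix / mag)
    return point, direction


def residual(ix, iy, it, u, v):
    """Left-hand side of the assumed constraint; zero for flows on the line."""
    return ix * u + iy * v + it


# Illustrative gradient values (hypothetical, not measured data).
ix, iy, it = 3.0, 4.0, -10.0
point, direction = constraint_line(ix, iy, it)

# Three different flow vectors that all satisfy the single constraint.
candidates = [
    (point[0] + s * direction[0], point[1] + s * direction[1])
    for s in (-2.0, 0.0, 5.0)
]
residuals = [residual(ix, iy, it, u, v) for u, v in candidates]
```

All three candidates have a residual of zero up to rounding, which is why a single constraint cannot fix the flow vector.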

The gradient constraint is usually coupled with an assumption that nearby points move
in a like manner to arrive at algorithms which solve for optical flow. Groups of neighboring
constraint equations are used to collectively constrain the optical flow at a pixel. Constraint
lines are combined in one of three ways. Methods of local optimization [5, 6, 7, 8, 10] solve a
set of constraint lines from a small neighborhood as a system of equations. Global optimization
techniques [3, 9, 11] minimize an error function based upon the gradient constraint and
an assumption of local smoothness of optical flow variations over the entire image. The
clustering approach [1, 2] operates globally, looking for groups of constraint lines with coinciding
points of intersection in flow space.

We will examine the local optimization technique in detail. Although we will not
directly address clustering and global optimization, many of the conclusions reached here also
apply to these approaches. Another paper examines some implications of this analysis for
global optimization methods [12].

4.  Local  Optimization. 

The method of local optimization estimates optical flow by solving a group of gradient
constraint lines obtained from a small region of the image as a system of linear equations.
Two constraint lines are sufficient to arrive at a unique solution for $(u, v)$. More than two
equations may be included in the system to reduce the effects of errors in the constraint lines.
The solution to the over-determined system may be found by any of a number of error
minimization techniques.

We will examine errors in the solution of two equation systems. In practice one should
solve an over-determined system by some method of best fit, such as least squares. The
analysis presented here is extended to over-determined systems in [13].

The pair of equations which we will solve to estimate optical flow at point
$p_i = (x_i, y_i, t_i)$ is

$$\begin{aligned} I_x^{(i)} u + I_y^{(i)} v &= -I_t^{(i)} \\ I_x^{(j)} u + I_y^{(j)} v &= -I_t^{(j)} \end{aligned} \tag{4}$$

where the gradients $I_x$, $I_y$, and $I_t$ in equations $i$ and $j$ are evaluated at $p_i$ and a nearby
point $p_j$.
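A minimal sketch of the two-equation solve, together with a least-squares variant for more than two constraints, is given below. The function names and the sample gradients are hypothetical; the least-squares routine uses the normal equations, which is one possible "method of best fit" and not necessarily the one used by the authors.

```python
def solve_two_constraints(gi, it_i, gj, it_j):
    """Solve gi.(u, v) = -it_i and gj.(u, v) = -it_j by Cramer's rule."""
    det = gi[0] * gj[1] - gi[1] * gj[0]
    if det == 0.0:
        raise ValueError("parallel constraint lines: no unique solution")
    u = (-it_i * gj[1] + it_j * gi[1]) / det
    v = (-it_j * gi[0] + it_i * gj[0]) / det
    return u, v


def solve_least_squares(gradients, its):
    """Best-fit (u, v) for several constraints via the 2x2 normal equations."""
    sxx = sum(gx * gx for gx, _ in gradients)
    sxy = sum(gx * gy for gx, gy in gradients)
    syy = sum(gy * gy for _, gy in gradients)
    bx = -sum(gx * it for (gx, _), it in zip(gradients, its))
    by = -sum(gy * it for (_, gy), it in zip(gradients, its))
    det = sxx * syy - sxy * sxy
    if det == 0.0:
        raise ValueError("all gradients are parallel: flow is not determined")
    u = (bx * syy - sxy * by) / det
    v = (sxx * by - sxy * bx) / det
    return u, v


# Synthetic, noise-free example: choose a flow and generate consistent I_t values.
true_flow = (1.0, -0.5)
gradients = [(2.0, 1.0), (1.0, 3.0), (-1.0, 2.0)]
its = [-(gx * true_flow[0] + gy * true_flow[1]) for gx, gy in gradients]

pair_estimate = solve_two_constraints(gradients[0], its[0], gradients[1], its[1])
fit_estimate = solve_least_squares(gradients, its)
```

With exact gradients both estimates recover the chosen flow; the remainder of the analysis concerns what happens when the gradients are not exact.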

The gradients in the system (4) are estimated from discrete images and will be inaccurate
due to noise in the imaging process and sampling measurement error. Also, the values
of $(u, v)$ at $p_i$ and $p_j$ are assumed to be the same. The formulation will be incorrect to the
extent that optical flow differs between the two points. We will examine how gradient
estimation error and error resulting from non-constant optical flow lead to errors in the
estimated flow vector.


4.1.  Gradient  Measurement  Error. 

The estimates of the intensity gradients $I_x$, $I_y$, and $I_t$ will be corrupted by errors in the
brightness estimates and inaccuracies introduced by sampling the brightness function
discretely in time and space. The error in the brightness function is random and results from
a variety of sources such as channel noise and quantization of brightness levels. We assume
that the brightness error is approximately additive and independent among neighboring
pixels. The gradient, estimated from changes in the brightness estimates, will contain a
component of random error which is distributed like the error in the brightness function. The
random component of the gradient error will be additive and independent of the magnitude
of the gradient to the extent that the brightness noise is additive.

The brightness function is sampled discretely in time and space and this will introduce
a systematic measurement error into the estimates $\hat{I}_x$, $\hat{I}_y$, and $\hat{I}_t$ of the gradients. The
gradient sampling error depends on the second and higher derivatives of the brightness function.
To examine the sampling error in $\hat{I}_x$ we expand the brightness function evaluated at
$(x + \Delta x, y, t)$ around the point $(x, y, t)$ producing

$$I(x + \Delta x, y, t) = I(x, y, t) + I_x \Delta x + \tfrac{1}{2} I_{xx} \Delta x^2 + \text{h.o.t.} \tag{5}$$

where $I_x$, $I_{xx}$ are the partial derivatives of brightness in the $x$ direction evaluated at $(x, y, t)$.
Rearranging terms we obtain an estimate for the brightness gradient in the $x$ direction:

$$\hat{I}_x = \frac{I(x + \Delta x, y, t) - I(x, y, t)}{\Delta x} = I_x + \tfrac{1}{2} I_{xx} \Delta x + \text{h.o.t.} \tag{6}$$


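The error $\epsilon_{I_x}$ is defined as $\hat{I}_x - I_x$, the difference between the computed and true
values. From (6), we obtain the approximate relationship

$$\epsilon_{I_x}(\text{sampling}) \approx \tfrac{1}{2} I_{xx} \Delta x. \tag{7}$$

Likewise, the sampling errors in the estimates of $I_y$ and $I_t$ are approximately given by

$$\epsilon_{I_y}(\text{sampling}) \approx \tfrac{1}{2} I_{yy} \Delta y \tag{8}$$

$$\epsilon_{I_t}(\text{sampling}) \approx \tfrac{1}{2} I_{tt} \Delta t \tag{9}$$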
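The size of the sampling error is easy to check numerically. The sketch below assumes a forward-difference gradient estimate and the standard Taylor coefficient of one half on the second-derivative term; the cubic brightness profile is a made-up example chosen so that every derivative is known exactly.

```python
def forward_difference(f, x, dx):
    """Forward-difference estimate of df/dx at x with sample spacing dx."""
    return (f(x + dx) - f(x)) / dx


def brightness(x):
    """Hypothetical one-dimensional brightness profile, I(x) = x**3."""
    return x ** 3


x0, dx = 1.0, 0.1
true_gradient = 3.0 * x0 ** 2        # I_x for the cubic profile
second_derivative = 6.0 * x0         # I_xx for the cubic profile

estimate = forward_difference(brightness, x0, dx)
observed_error = estimate - true_gradient
predicted_error = 0.5 * second_derivative * dx   # leading sampling-error term

# For a cubic the only higher-order term left over is exactly dx**2.
leftover = observed_error - predicted_error
```

Halving `dx` roughly halves the observed error, which is the sense in which the sampling error is controlled by the sample spacing and the second derivative.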


The sampling error for the spatial gradients depends upon the spatial resolution of the camera,
$\Delta x$ and $\Delta y$, and the second spatial derivatives of the brightness function, $I_{xx}$ and $I_{yy}$. The
sampling error for the temporal gradient is influenced by the frame rate, $\Delta t$, and
the higher order derivatives of the brightness function over time.

We can express $I_{tt}$ purely in terms of spatial derivatives and motion.
Differentiating the gradient constraint equation (3) with respect to $x$, $y$, and $t$ we obtain the
following three equations
Where the second derivatives of the brightness function exist and are continuous, the left-hand
sides of equations (10) and (11) can be substituted for $I_{xt}$ and $I_{yt}$ in (12). Collecting
terms we see that
(13)


The  first  term  in  (13)  depends  upon  optical  flow  while  the  rest  of  the  left-hand  side  depends 
upon  the  derivatives  of  optical  flow  over  time  and  space.  If  optical  flow  is  approximately 
constant in a small neighborhood and approximately constant over time at each point on the
image then


(14) 


Note the similarity between (3) and (14). We have derived a constraint equation for second
derivatives that is analogous to the gradient constraint equation.
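As a worked sketch of this step, the differentiation and substitution can be written out explicitly. The arrangement of terms and the signs below are an assumption: they start from the gradient constraint written as $I_x u + I_y v + I_t = 0$, and subscripts on $u$ and $v$ denote their partial derivatives. The original equations may group the terms differently.

```latex
\begin{align*}
% Differentiate I_x u + I_y v + I_t = 0 with respect to x, y, and t:
I_{xx} u + I_{xy} v + I_x u_x + I_y v_x &= -I_{xt} \\
I_{xy} u + I_{yy} v + I_x u_y + I_y v_y &= -I_{yt} \\
I_{xt} u + I_{yt} v + I_x u_t + I_y v_t &= -I_{tt} \\[4pt]
% Substitute the first two lines for I_{xt} and I_{yt} in the third:
I_{tt} &= \left( I_{xx} u^2 + 2 I_{xy} u v + I_{yy} v^2 \right) \\
       &\quad + I_x \left( u\,u_x + v\,u_y - u_t \right)
              + I_y \left( u\,v_x + v\,v_y - v_t \right) \\[4pt]
% Flow constant over the neighborhood and over time (all derivatives of u, v vanish):
I_{tt} &= I_{xx} u^2 + 2 I_{xy} u v + I_{yy} v^2
\end{align*}
```

Under this reading, the last line plays the same role for second derivatives that the gradient constraint plays for first derivatives: it ties the temporal second derivative to the spatial second derivatives through the flow.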

Without loss of generality, we can rotate our coordinate system so that the flow vector
at a point lies along the $x$-axis. In the new coordinate system we have
(15)


It is evident from (15) that the magnitude of $I_{tt}$ depends upon nonlinearities of the brightness
function in the direction of motion and the magnitude of motion.
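This dependence can be checked on a synthetic pattern. The sketch below assumes a one-dimensional brightness profile translating rigidly at constant velocity, for which the temporal second derivative equals the squared velocity times the spatial second derivative, and it assumes the forward-difference error has the standard leading term of one half times the second derivative times the frame interval. The sine profile and all numeric values are illustrative.

```python
import math


def brightness(x, t, u):
    """Hypothetical pattern sin(x) translating along x with constant velocity u."""
    return math.sin(x - u * t)


def second_difference(f, h):
    """Central second-difference estimate of f''(0) with step h."""
    return (f(h) - 2.0 * f(0.0) + f(-h)) / (h * h)


u, x0, t0, h = 2.0, 0.3, 0.0, 1e-3

# Second derivatives of brightness in space and in time at (x0, t0).
i_xx = second_difference(lambda d: brightness(x0 + d, t0, u), h)
i_tt = second_difference(lambda d: brightness(x0, t0 + d, u), h)

# Forward-difference estimate of I_t across one frame interval dt.
dt = 0.01
i_t_true = -u * math.cos(x0 - u * t0)
i_t_estimate = (brightness(x0, t0 + dt, u) - brightness(x0, t0, u)) / dt
observed_error = i_t_estimate - i_t_true

# Leading-order prediction: half of I_xx times flow squared times dt.
predicted_error = 0.5 * i_xx * u * u * dt
```

Doubling `u` in this sketch roughly quadruples the error in the temporal derivative, since the prediction scales with the square of the flow.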

In summary, the systematic errors in the gradients which make up the coefficients of (4)
are given by (7), (8), and (9). In general, the systematic error in estimating $I_t$ is influenced
by the magnitude of optical flow, the derivatives of optical flow, and the first and second
spatial derivatives of brightness. When the axis of the coordinate system is aligned with motion
and optical flow is nearly constant over time and space we can characterize the systematic
error in the temporal derivative by
The estimation scheme we have been analyzing assumes that velocity on the image
plane is constant in some small neighborhood. This will be true only for very special surfaces
and motions. When optical flow is not constant the method can provide a good approximation
where flow varies slowly over small neighborhoods.

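The true set of equations in (4) should actually be

$$\begin{aligned} I_x^{(i)} u + I_y^{(i)} v &= -I_t^{(i)} \\ I_x^{(j)} (u + \Delta u) + I_y^{(j)} (v + \Delta v) &= -I_t^{(j)} \end{aligned} \tag{17}$$

where the actual flow vectors at points $p_i$ and $p_j$ are $(u, v)$ and $(u + \Delta u, v + \Delta v)$, respectively,
and the gradients are estimated at points $p_i$ and $p_j$. The difference between the true solution
and our estimate can be treated as an error on the right-hand side of (17) by distributing
the multiplication on the left-hand side of the system and rearranging terms as

$$\begin{aligned} I_x^{(i)} u + I_y^{(i)} v &= -I_t^{(i)} \\ I_x^{(j)} u + I_y^{(j)} v &= -\left( I_t^{(j)} + \Delta I_t^{(j)} \right) \end{aligned} \tag{18}$$

where

$$\Delta I_t^{(j)} = I_x^{(j)} \Delta u + I_y^{(j)} \Delta v. \tag{19}$$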
Thus, the error caused by violation of the constant flow assumption can be treated as an
additional error in the estimate of $I_t$.
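A small numerical sketch illustrates this equivalence. It assumes the flow at the neighboring point differs from the flow at the first point by a known amount, generates temporal gradients consistent with that, and then solves the pair of constraints twice: once as measured, and once with the neighbor's temporal gradient adjusted by the gradient dotted with the flow difference. All names and values are hypothetical.

```python
def dot(a, b):
    """Dot product of two 2-vectors."""
    return a[0] * b[0] + a[1] * b[1]


def solve_pair(gi, bi, gj, bj):
    """Solve gi . w = bi and gj . w = bj for w = (u, v) by Cramer's rule."""
    det = gi[0] * gj[1] - gi[1] * gj[0]
    if det == 0.0:
        raise ValueError("gradients are parallel; the system is singular")
    u = (bi * gj[1] - gi[1] * bj) / det
    v = (gi[0] * bj - bi * gj[0]) / det
    return u, v


w = (1.0, 0.5)        # true flow at the first point
dw = (0.2, -0.1)      # change in flow between the two points
gi = (2.0, 1.0)       # spatial gradient at the first point
gj = (1.0, 3.0)       # spatial gradient at the neighboring point

# Temporal gradients consistent with the true (non-constant) flow.
it_i = -dot(gi, w)
it_j = -dot(gj, (w[0] + dw[0], w[1] + dw[1]))

# Solving as if the flow were constant gives a biased estimate.
estimate = solve_pair(gi, -it_i, gj, -it_j)

# Treating gj . dw as an error in the neighbor's temporal gradient recovers w.
extra = dot(gj, dw)
corrected = solve_pair(gi, -it_i, gj, -(it_j + extra))
```

In this example the uncorrected estimate is approximately (1.02, 0.46), while the corrected system returns the true flow (1.0, 0.5).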

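To examine the significance of this error, we will consider the size of the additional term,
$I_x^{(j)} \Delta u + I_y^{(j)} \Delta v$, relative to $I_t^{(j)}$.

But first we will convert to vector notation. Let $w = (u, v)^T$ be the flow vector,
$\Delta w = (\Delta u, \Delta v)^T$ the local change in optical flow, and $g^{(j)} = (I_x^{(j)}, I_y^{(j)})^T$ the spatial
gradient at $p_j$.

For the constraint equation at $p_j$, we know from (17) and (19) that the ratio of the additional
term to $-I_t^{(j)}$ is

$$\frac{g^{(j)} \cdot \Delta w}{g^{(j)} \cdot (\Delta w + w)} = \frac{\| \Delta w \| \cos\theta}{\| \Delta w + w \| \cos\xi}$$

where $\theta$ is the angle between the gradient vector, $g^{(j)}$, and the local change in optical flow,
$\Delta w$, and $\xi$ is the angle between the gradient vector, $g^{(j)}$, and the vector $\Delta w + w$.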

The relative error in $I_t$ depends upon the relative lengths of the vectors $w$ and $\Delta w$ and
the degree to which each is magnified by the spatial gradient. In general the orientations of
the spatial gradient, optical flow, and the local change in optical flow will be independent.
So the spatial gradient will on the average magnify the flow vector in the same proportion as
the change of flow vector. Therefore, on average, we expect the relative error in $I_t$ to be
strongly related to relative magnitudes of the flow and change of flow vectors.

In most scenes, flow will vary slowly over most of the image. At surface boundaries we
can expect to frequently find discontinuities in optical flow due to discontinuities in motion
or depth. Here, the variation in flow will contribute a substantial error and flow estimates
will usually be quite poor. However, much of the image will consist of smoothly varying
surfaces. When neighboring image points lie on the same smooth surface, flow will generally be
similar and hence, the error contributed by variations in flow will be small.

We will consider an example which allows arbitrary three-dimensional translation of a
planar surface to demonstrate the important factors influencing the error contributed by
variations in optical flow. We consider two neighboring image points that lie on a surface
translating with velocity $(U, V, W)$ in three-dimensional space (see Figure A1). Let the
surface be defined by the planar equation

$$Z(X, Y) = R + \alpha X + \beta Y. \tag{23}$$

In appendix A we derive the following approximate bound


The angle in (24) is the angle subtended by $(\Delta x, \Delta y)$ with a focal length of $f$; this is simply the
size of the neighborhood measured in degrees of visual angle. The length of the change-of-flow
vector relative to the length of the flow vector depends upon the size of the neighborhood,
the slope of the surface viewed, and the ratio of velocity along the line of sight to
velocity perpendicular to the line of sight.
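These dependencies can be explored with a small sketch. It does not reproduce the bound from the appendix; instead it directly computes the flow at two neighboring image points under assumptions that are not stated in this section: perspective projection $x = fX/Z$, $y = fY/Z$, and image flow $((fU - xW)/Z, (fV - yW)/Z)$ for a surface point moving with velocity $(U, V, W)$. The function names and scene values are illustrative.

```python
import math


def depth_at(x, y, f, R, alpha, beta):
    """Depth of the plane Z = R + alpha*X + beta*Y seen at image point (x, y)."""
    return R / (1.0 - (alpha * x + beta * y) / f)


def flow_at(x, y, f, R, alpha, beta, U, V, W):
    """Image flow at (x, y) for a surface translating with velocity (U, V, W)."""
    z = depth_at(x, y, f, R, alpha, beta)
    return ((f * U - x * W) / z, (f * V - y * W) / z)


def relative_flow_change(p, q, **scene):
    """Length of the change-of-flow vector between p and q over the flow length at p."""
    pu, pv = flow_at(p[0], p[1], **scene)
    qu, qv = flow_at(q[0], q[1], **scene)
    return math.hypot(qu - pu, qv - pv) / math.hypot(pu, pv)


# Slanted plane with a velocity component along the line of sight.
slanted = dict(f=1.0, R=10.0, alpha=0.5, beta=0.0, U=1.0, V=0.0, W=0.5)
ratio_small = relative_flow_change((0.0, 0.0), (0.01, 0.0), **slanted)
ratio_large = relative_flow_change((0.0, 0.0), (0.02, 0.0), **slanted)

# Frontal plane translating parallel to the image: the flow is uniform.
frontal = dict(f=1.0, R=10.0, alpha=0.0, beta=0.0, U=1.0, V=0.0, W=0.0)
ratio_frontal = relative_flow_change((0.0, 0.0), (0.01, 0.0), **frontal)
```

In this example doubling the neighborhood size roughly doubles the relative change in flow, and the change vanishes for a frontal plane moving parallel to the image.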

Recall that the value given by (24) represents a rough measure of the proportion of
error on the right-hand side contributed by variations in optical flow. If the neighborhood is
small we expect random errors in the temporal gradient to usually be larger than the error
caused by flow variation. The gradient measurement errors discussed in the last section may
lead to much larger degradation. So, for most of the image the error caused by variation in
flow should not constitute a problem. However, at surface boundaries optical flow can
change dramatically, especially when object motions are allowed. Here, the local optimization
result will be a very poor measure of optical flow.


4.3. Ill-conditioning

The accuracy of the estimates $u$ and $v$ will depend on the measurement errors in the
gradient constraint equations and the error propagation characteristics of the linear system.
When a system of linear equations is very sensitive to small errors in the coefficients or
right-hand side it is said to be ill-conditioned. If the spatial intensity gradients change slowly
then the linear system will contain constraint lines that are nearly parallel. As a consequence,
the system will be nearly singular and small errors in the gradient measurements
may result in large changes in the estimated flow value. We will find that the conditioning
of the linear system largely depends upon nonlinearities in the brightness function which are
perpendicular to the brightness gradient.

If the gradients are known exactly and optical flow is constant then
As before, the rows of $G$ and $b$ are taken from a point $p_i$ and its neighbor $p_j$. The vector $w$
will be in error to the degree that the gradient measurements are inaccurate and optical flow
varies between points $p_i$ and $p_j$. The previous section showed that the error accrued when
$u$ and $v$ are not constant is the same as that which would be obtained if the $b$ vector is suitably
modified as in (18). This error will be absorbed on the right-hand side of (26). Thus,
the system which is actually solved is
(28)


where,
The errors in the spatial and temporal gradients arise from both systematic and random
measurement  errors. 


A number of measures of conditioning have been proposed [14]. The most widely used
index of conditioning is the condition number, $\text{cond}(\cdot)$, which is defined as

$$\text{cond}(G) = \| G \| \, \| G^{-1} \| \tag{30}$$

for a matrix of coefficients $G$. The condition number roughly estimates the extent to which
relative errors in the coefficients and the right-hand side are magnified in the estimate of
optical flow. For the problem at hand, the conditioning of the matrix $G$ is determined by
the nature of the spatial brightness function over the interval $(p_i, p_j)$.

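The inverse of $G$ can be directly calculated as

$$G^{-1} = \frac{1}{\| g^{(i)} \| \, \| g^{(j)} \| \sin\phi} \begin{pmatrix} I_y^{(j)} & -I_y^{(i)} \\ -I_x^{(j)} & I_x^{(i)} \end{pmatrix}$$

where $g^{(i)}$ is the spatial gradient vector at $p_i$ and $\phi$ is the angle between $g^{(i)}$ and $g^{(j)}$.

Before we can evaluate the condition number we must select a matrix norm. We will
use the Frobenius norm¹. We will continue to use the $\| \cdot \|_2$ norm to evaluate vector norms.
From the definition of $\| \cdot \|_F$ and the results above we have

$$\text{cond}(G) = \frac{\| g^{(i)} \|^2 + \| g^{(j)} \|^2}{\| g^{(i)} \| \, \| g^{(j)} \| \, | \sin\phi |}$$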
The magnitude of $\text{cond}(G)$ depends on the orientations and relative magnitudes of the
two spatial gradient vectors. The value of $\text{cond}(G)$ is minimized when the spatial gradients
are perpendicular and have the same magnitude. As the spatial gradients become more
nearly parallel the magnitude of $\text{cond}(G)$ is increased, and hence, error propagation is
worsened. Increases in the relative difference in the magnitudes of the spatial gradients also
cause $\text{cond}(G)$ to increase. The magnitude of this effect will not usually be important. If
neither of the gradients is very small, then the relative sizes of the gradients will not differ
enormously. The gradients will be poorly estimated where they are small, so for multiple
reasons estimates will be error prone in these regions.
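The behavior described here can be reproduced with a short sketch that computes the Frobenius-norm condition number of the two-by-two gradient matrix in two ways: directly from the matrix and its inverse, and from the gradient magnitudes and the sine of the angle between them. The closed form used in `cond_from_angle` is worked out here for the two-by-two case and is an assumption about what the report's expression looks like; the function names and gradient values are illustrative.

```python
import math


def frobenius(m):
    """Frobenius norm: square root of the sum of squared elements."""
    return math.sqrt(sum(e * e for row in m for e in row))


def inverse2(m):
    """Inverse of a 2x2 matrix given as nested sequences."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("singular matrix: gradients are parallel")
    return [[d / det, -b / det], [-c / det, a / det]]


def cond_frobenius(gi, gj):
    """cond(G) = ||G||_F * ||G^-1||_F with rows gi and gj."""
    g = [list(gi), list(gj)]
    return frobenius(g) * frobenius(inverse2(g))


def cond_from_angle(gi, gj):
    """Same quantity from the gradient lengths and the angle between them."""
    ni, nj = math.hypot(*gi), math.hypot(*gj)
    cross = gi[0] * gj[1] - gi[1] * gj[0]   # equals ni * nj * sin(angle)
    return (ni * ni + nj * nj) / abs(cross)


angle = math.radians(5.0)
perpendicular_equal = cond_frobenius((1.0, 0.0), (0.0, 1.0))
perpendicular_unequal = cond_frobenius((1.0, 0.0), (0.0, 4.0))
nearly_parallel = cond_frobenius((1.0, 0.0), (math.cos(angle), math.sin(angle)))
```

Perpendicular gradients of equal length give the minimum value of 2, a fourfold difference in length raises it only to 4.25, and a five-degree separation raises it to about 23, consistent with the angle being the dominant factor.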

The most important factor determining conditioning is the angle between the gradients.
Where the gradients are nearly parallel, conditioning will be a problem. Thus, if both points
lie along a straight edge, we cannot obtain a solution. (This is an example of the aperture
problem [11].)

Some higher derivatives of brightness must be large for there to be a significant change
in gradient orientation over a small neighborhood. Let $\Delta g$ be the difference between the two
gradient vectors. We can expand the gradient in a Taylor series
Consequently,

¹ The Frobenius norm, $\| \cdot \|_F$, is defined as the square root of the sum of the squares of all the
elements. The Frobenius norm can be used to bound the more familiar $\| \cdot \|_2$ norm [15]. It can be
shown that
The angle between the gradients depends on the component of $\Delta g$ that is perpendicular to
$g^{(i)}$. If optical flow is to be accurately estimated in a small region around $p_i$, then at least
one component of the second derivative perpendicular to the gradient must be large. There
must be at least some direction in which we can select a neighbor so that the gradient
orientation $g^{(j)}$ will differ from $g^{(i)}$.
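A first-order sketch of this argument follows. It assumes the change in gradient between neighbors is approximated by the matrix of second derivatives of brightness applied to the displacement between the points, which is the leading term of a Taylor expansion and may differ in form from the report's own expansion. The function names and the two example matrices are hypothetical.

```python
import math


def gradient_change(hessian, step):
    """First-order change in the gradient over a displacement step = (dx, dy)."""
    (ixx, ixy), (_, iyy) = hessian
    dx, dy = step
    return (ixx * dx + ixy * dy, ixy * dx + iyy * dy)


def orientation_change(g, dg):
    """Angle in radians between the gradient g and the neighboring gradient g + dg."""
    norm = math.hypot(*g)
    parallel = (g[0] * dg[0] + g[1] * dg[1]) / norm
    perpendicular = (g[0] * dg[1] - g[1] * dg[0]) / norm
    return math.atan2(perpendicular, norm + parallel)


g = (1.0, 0.0)   # gradient at the first point, along the x axis

# Curvature only along the gradient: its length changes but not its direction.
along_only = ((5.0, 0.0), (0.0, 0.0))
turn_along_x = orientation_change(g, gradient_change(along_only, (0.1, 0.0)))
turn_along_y = orientation_change(g, gradient_change(along_only, (0.0, 0.1)))

# Curvature perpendicular to the gradient: a neighbor in y sees a rotated gradient.
across = ((0.0, 0.0), (0.0, 5.0))
turn_across = orientation_change(g, gradient_change(across, (0.0, 0.1)))
```

Only the second case produces a usable angle between the two constraint lines (about 0.46 radians here); in the first case no choice of neighbor rotates the gradient.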

4.4. Combining the Sources of Error

We now face a dilemma. We have just shown that some component of the second
derivative must be large to minimize error propagation. However, we earlier showed that
sampling errors in the gradients were proportional to the magnitude of the second derivative.
There is a tradeoff between the gradient measurement errors and conditioning. The
problem would not be too serious if we were only concerned about errors in the spatial
gradients. If we let the sampling interval be reasonably small with respect to the neighborhood
from which we select our equations, we can potentially satisfy both goals: the gradient can
change slowly from pixel to pixel but the total variation over the neighborhood can be large
enough to allow acceptable conditioning.

A serious conflict can arise in the tradeoff between conditioning and sampling errors in
the temporal derivative. Recall that the systematic measurement error in $I_t$ is proportional
to nonlinearities in the spatial brightness function (13). To achieve acceptable conditioning,
the spatial brightness function must be nonlinear in some direction. If optical flow is
oriented in this direction, then the condition number and measurement errors will be
inversely related: increases in the magnitude of the second spatial derivatives will reduce the
condition number and increase the measurement error. Note that there need not be a
conflict; optical flow can be perpendicular to the direction in which the gradient orientation is
varying.

The problem is heightened by the sensitivity of measurements where the flow vector is
large. The systematic measurement error in the temporal derivative increases as the square
of flow magnitude (13). Where flow is large, even small nonlinearities can contribute
significant measurement errors. However, where object points are stationary or moving
slowly, the measurement error in the temporal gradient will be negligible and the most accurate
estimates will be obtained where the gradients are not small and vary rapidly.
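The quadratic growth of the systematic temporal error with flow magnitude can be illustrated in one dimension. This is a sketch under stated assumptions: the brightness pattern is a translating sinusoid, the temporal gradient is measured by a one-frame forward difference, and the function name and parameter values are hypothetical.

```python
import numpy as np

def temporal_gradient_error(u, x=1.0, k=1.0):
    """Systematic error of a forward-difference temporal gradient.

    Assumes I(x, t) = sin(k * (x - u * t)) and a unit frame interval.
    """
    measured = np.sin(k * (x - u)) - np.sin(k * x)  # I(x, t + 1) - I(x, t) at t = 0
    exact = -u * k * np.cos(k * x)                  # true dI/dt at t = 0
    return measured - exact

slow = temporal_gradient_error(0.05)
fast = temporal_gradient_error(0.10)
# Doubling the flow roughly quadruples the error (second-order Taylor term).
print(f"error ratio for doubled flow: {fast / slow:.3f}")
```

The leading error term is $u^2 I_{xx} / 2$, so the ratio approaches 4 as the flow becomes small relative to the spatial scale of the pattern.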

As an illustration of the interplay between the concerns of conditioning and measurement
error, consider an image painted with an isotropic texture. If the region is stationary
then a large amount of detail will be desirable to improve conditioning. If optical flow is
significantly greater than zero, then too much detail will lead to unacceptably large
measurement errors. A balance must be struck between these two sources of error.

The conditioning of G can be improved by using a large neighborhood. The risk in
choosing neighbors over too great a distance is that the error due to non-constant flow can
become very large. If the neighbors lie on a single surface, the contribution of errors due to
non-constant flow will usually grow slowly with neighborhood size. But if neighbors lie on
different surfaces, their motions may differ substantially. As neighborhood size increases, it
becomes more likely that neighbors will be across a surface boundary, and the difference in
optical flow will lead to significant errors.

The total error in the flow estimate is determined by the characteristics of the optical
flow field, the nature of the brightness function, and the selection rule for constructing the
linear system. The sources of error are summarized in Table 1.


Error Source                        Determinants

1. Gradient Measurement Error
   (a) random (I_x, I_y, I_t)       (i)   ↑ sensor noise
                                    (ii)  ↑ quantization noise
   (b) systematic (I_t)             (i)   ↑ nonlinearities in the brightness function
                                          in the direction of optical flow
                                    (ii)  ↑ optical flow magnitude

2. Non-constant Flow                (i)   ↑ neighborhood size
                                    (ii)  ↑ surface slant
                                    (iii) ↑ ratio of velocity along the line of sight to
                                          velocity perpendicular to the line of sight †

3. Ill-conditioning                 (i)   ↓ neighborhood size
                                    (ii)  ↓ sin of the angle between the spatial
                                          gradient vectors
                                    (iii) ↑ relative difference in the magnitudes of the
                                          spatial gradient vectors

↑ error increases with determinant
↓ error decreases with determinant
† for translating surfaces

Table 1. The sources of error in local estimates of optical flow.


These factors interact in a complex way to determine the accuracy of the local optimization
scheme. Only when the contribution of these sources of error is balanced will good estimates
be obtained.

5. Algorithm Extensions Based Upon the Error Analysis.

We next consider how knowledge about the causes of errors can be used to reduce error
and introduce techniques to judge the accuracy of estimates. The improvements in
performance are based upon parameter selection and preprocessing of the image to extract the
most information from a region while minimizing the intrusions of error. A method of
iterative refinement [5, 16] is also described.

By examining the image sequence for the conditions which lead to errors we can judge
the accuracy with which estimates can be made before the estimate is actually made.
Examination of the flow estimate itself can provide additional information about the precision of
the estimate. Together, a priori and a posteriori estimates of accuracy provide a useful
heuristic for evaluating the precision of optical flow estimates.


5.1. Error Reduction Techniques

5.1.1.  Smoothing 

Blurring the image will reduce nonlinearities in the brightness function and
consequently diminish the systematic error in the gradient estimates. Blurring will also worsen the
propagation characteristics of the linear system, causing random measurement errors and the
errors due to non-constant flow to be magnified. Hence, blurring is desirable only in regions
where the systematic error is predominant.

As noted in the last chapter, the systematic error in the gradients depends upon the
nonlinearity of the brightness function over the sampling interval. For the temporal
gradient, the systematic measurement error depends upon the linearity of the brightness
function over the region which moves past a point of observation on the image and the variations
of optical flow over time and space. Blurring will be most effective in portions of the image
which undergo a significant motion and contain large nonlinearities in the brightness
function. The degree of blurring should be sufficient to approximately linearize the brightness
function over the region of translation.

The damage which blurring does to the conditioning of the linear system can be
counterbalanced by increasing the size of the neighborhood over which the system is
constructed. The risk incurred by enlarging the area from which the constraint equations are
drawn is that the motions of the points may differ significantly, as could happen if points lay
on two different surfaces. The selection of the radius of blur and the neighborhood size must
be made judiciously so as to avoid increasing the error in the solution vector.

5.1.2.  Over-determined  Systems 

Until this point we have ignored the problem of selecting the direction in which the
neighbor is to be chosen to form the linear system. From our previous discussion of error
propagation it is clear that the choice of direction can dramatically affect the error in the
optical flow estimate. One way to circumvent the difficulty of choosing an appropriate
direction is to construct an over-determined set of equations from points in many directions. The
over-determined system can be solved by minimizing the residual over possible values of
optical flow. The choice of the norm to be minimized and the minimization scheme may be an
important determinant of the error, but are not analyzed here. As with two-equation
systems, conditioning will be important for over-determined systems, and conditioning will be
related to the same characteristics of the image as in the two-equation case. Another
approach is to perform the analysis separately in a number of directions and then seek a
consensus among the solutions [17].


5.1.3. Iterative Registration

If optical flow is known approximately, then this knowledge can be used to reduce the
error in the local optimization technique. We develop a more general form of the gradient
constraint equation that solves for the difference between an approximate estimate and the
actual flow. Our derivation abbreviates an analysis presented by Paquin and Dubois [16].

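Consider the image sequence that samples the three-dimensional image function, as
pictured in Figure 1. We actually estimate the displacement of a point between successive
samples of the image sequence. If velocity is constant, then the displacement observed on the
image over the time interval $\Delta t$ is $(u \Delta t, v \Delta t)$. Let $d$ be a displacement vector in
3-dimensional $x, y, t$-space. Let $\hat{d}$ be an estimate of $d$. Given a displacement estimate

$$\hat{d} = \begin{bmatrix} \hat{u}\,\Delta t \\ \hat{v}\,\Delta t \\ \Delta t \end{bmatrix} = \begin{bmatrix} x\text{-component of displacement} \\ y\text{-component of displacement} \\ t\text{-component of displacement} \end{bmatrix} \qquad (37)$$

we can estimate optical flow by $(\hat{u}, \hat{v})$.

The vector $\hat{d} / \|\hat{d}\|$ is a unit vector in the direction of the estimated displacement. The
gradient of $I$ in this direction is

$$\begin{aligned}
I_{\hat{d}} &= \nabla I \cdot \frac{\hat{d}}{\|\hat{d}\|} = \left( I_x \hat{u} + I_y \hat{v} + I_t \right) \frac{\Delta t}{\|\hat{d}\|} \\
&= \left( I_x \hat{u} + I_y \hat{v} - I_x u - I_y v \right) \frac{\Delta t}{\|\hat{d}\|} \qquad \text{(using (3))} \\
&= \frac{\Delta t}{\|\hat{d}\|} \left( I_x\, \delta u + I_y\, \delta v \right)
\end{aligned}$$

where $\delta u = \hat{u} - u$ and $\delta v = \hat{v} - v$ are the errors in the estimated flow velocities. Finally, we get
an expression that relates the error in the displacement estimate to measurable brightness
gradients

$$\|\hat{d}\|\, I_{\hat{d}} = I_x\, \delta u\, \Delta t + I_y\, \delta v\, \Delta t = \nabla I^{T} \Delta d \qquad (41)$$

where

$$\Delta d = \hat{d} - d = \begin{bmatrix} \delta u\, \Delta t \\ \delta v\, \Delta t \\ 0 \end{bmatrix} \qquad (42)$$

We can compute an estimate of the quantity (41) by using the Taylor expansion

$$I(x + \hat{u}\Delta t,\; y + \hat{v}\Delta t,\; t + \Delta t) = I(x, y, t) + \|\hat{d}\|\, I_{\hat{d}} + \cdots \qquad (43)$$

Solving for $I_{\hat{d}}$ and combining with (41) yields the approximation

$$\begin{bmatrix} I_x & I_y \end{bmatrix} \begin{bmatrix} \delta u \\ \delta v \end{bmatrix} \Delta t \approx I(x + \hat{u}\Delta t,\; y + \hat{v}\Delta t,\; t + \Delta t) - I(x, y, t) \qquad (44)$$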

The new constraint equation (41) is a more general form of the gradient constraint
equation. The more general form relates the gradients in an arbitrary direction to the spatial
gradients and optical flow. If the displacement estimate is $(0, 0, \Delta t)$, then $I_{\hat{d}} = I_t$.

We can use the general form of the gradient constraint equation to refine an estimate $\hat{d}$
by solving (44) for $\Delta d$. This process can be performed iteratively to find successively better
estimates of optical flow. An improvement can be expected, on the average, whenever
successive registrations are closer to the true displacement vector:

$$\|\Delta d_{i+1}\| < \|\Delta d_i\|, \qquad i = 1, 2, \ldots \qquad (45)$$

The improvement arises from successively better estimates of $I_{\hat{d}}$. As was demonstrated
earlier in equation (13), the systematic error in the estimate of the temporal derivative grows as the
square of flow magnitude. The same relationship is true for the directional derivative and the
flow difference in the general constraint equation.

Solving for the difference between an estimate of optical flow and the true optical flow
is computationally equivalent to registering a portion of an image pair and estimating the
change of position in the adjusted sequence. For this reason the technique has been called
iterative registration [5]. The estimate of optical flow may be derived from estimates made
at some previous time or from prior processing on a single frame pair.
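The refinement loop can be sketched in one dimension. The following is a minimal illustration, assuming an analytically known brightness profile so that registration is an exact resampling; the function name, the sinusoidal profile, and the shift value are assumptions made for the example and are not taken from the report.

```python
import numpy as np

def iterative_registration_1d(f, df, x, true_shift, iterations=3):
    """Refine a 1-D displacement estimate by repeated registration.

    f, df      : brightness profile and its derivative (callables)
    x          : sample positions in the first frame
    true_shift : displacement that generated the second frame, f(x - true_shift)
    Returns the final estimate and the absolute error after each iteration.
    """
    estimate = 0.0                      # zero flow initializes the first iteration
    frame1 = f(x)
    grad = df(x)                        # spatial gradient of the first frame
    errors = []
    for _ in range(iterations):
        # Second frame sampled at the registered positions x + estimate.
        registered = f(x + estimate - true_shift)
        # 1-D analogue of (44): grad * delta ~= registered - frame1,
        # solved for delta in the least-squares sense over all samples.
        delta = np.sum(grad * (registered - frame1)) / np.sum(grad ** 2)
        estimate -= delta               # delta estimates (estimate - true_shift)
        errors.append(abs(estimate - true_shift))
    return estimate, errors

x = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
estimate, errors = iterative_registration_1d(np.sin, np.cos, x, true_shift=0.8)
print(estimate, errors)
```

In this example each registration leaves a smaller remaining displacement than the last, so the condition in (45) holds and the error shrinks at every step.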

Note that if the inequality of (45) does not hold, then the error might be expected to
increase. If an estimate of optical flow is poor, then the refinement effort may lead to an even
larger error. The next section is devoted to methods to evaluate the quality of optical flow
estimates. A measure of the accuracy of a flow estimate can be used to judge whether or not
the estimate should be used for registration. Alternatively, the degree of registration can be
based on the confidence put in the flow estimate: the more accurate the estimate is judged to
be, the more the frame pair should be adjusted in the direction of the estimate.

The iterative registration technique can be combined with variable blurring to produce
a coarse-to-fine system for estimating optical flow [5]. Flow is roughly estimated with an
image sequence which has been blurred so that the brightness function is approximately
linear over areas the size of the maximum expected displacement. The coarse estimate of
optical flow is used, at each point, to register a small region of the image at a finer level of
resolution. This process is repeated at successively finer levels of resolution.

How much advantage can be gained from iterative registration? The spatial variation
of optical flow will not be affected by registration. Thus, the error due to incompatibilities
among equations in the linear system is unaffected by iterative registration. Also, the
estimate of the directional gradient will contain some amount of random measurement error
even if successive frames are in perfect registration. The propagation of these errors depends
primarily upon the conditioning of G, which is not influenced by registration. We cannot
expect to reduce the error in $\hat{d}$ below that caused by random error in $I_{\hat{d}}$ and
non-constant flow through iterative registration.

While performing a coarse-to-fine registration, the degree of blurring at each stage
should be appropriate to the expected error in optical flow at the next more coarse level of
analysis. In the absence of knowledge about the motions of individual points, the blurring
must be performed uniformly across the image. While the error will, on the average, be
reduced for points which translate significantly, the error will tend to be increased for points
which are stationary or move very little. No benefit is obtained by linearizing the brightness
function at stationary regions, and the error propagation characteristics are worsened. Some
of the accuracy lost at stationary regions during coarse processing might be recovered at finer
levels but, in general, the best estimates could be obtained at a fine level without
registration. In the next section methods are developed to estimate the accuracy of optical flow
estimates. This information can be used in the coarse-to-fine system of iterative registration to
judge whether an improvement has been obtained at each level. A priori estimates of the
magnitude of flow are also developed in the next section. The iterative registration technique
can be improved by adapting the technique to knowledge about the accuracy of estimates
and the magnitude of motion.


5.2.  Estimating  Error 

Many of the factors which lead to errors in the local optimization estimation technique
can be identified and measured from the image. The error propagation characteristics of the
linear system can be estimated from the matrix of spatial gradients. The degree to which
relative errors are magnified is indicated by cond(G). Regions of the image for which the
propagation characteristics are poor will be very sensitive to small measurement errors in the
gradients. The optical flow estimates obtained in these regions are likely to be inaccurate.

The systematic measurement error in $I_t$ was shown to depend upon the linearity of the
brightness function in the direction of motion (13). One way to measure the nonlinearity
of the brightness function is to compare the spatial gradients of brightness in successive
frames [2, 5]. If $I_x(x, t)$ is significantly different from $I_x(x, t + \delta t)$, then it can be inferred
that the estimate of the temporal gradient is likely to be in error.

Once an estimate has been obtained we can bound the error by referring back to the
image. The following a posteriori error bound can be derived from (44):

$$\|(\delta u, \delta v)\|\, \Delta t \;\ge\; \frac{\left| I(x + \hat{u}\Delta t,\; y + \hat{v}\Delta t,\; t + \Delta t) - I(x, y, t) \right|}{\|(I_x, I_y)\|} \qquad (46)$$


If  the  norm  of  the  spatial  gradient  is  not  too  small,  this  will  provide  a  good  measure  of  the 
magnitude  of  the  error  in  the  flow  estimate. 
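The bound in (46) is inexpensive to evaluate once a flow estimate is available. The sketch below assumes an analytic brightness function so that the displaced sample needs no interpolation; the brightness pattern, the flow values, and the function names are hypothetical choices for illustration.

```python
import numpy as np

U_TRUE, V_TRUE = 1.5, -0.5  # hypothetical true flow (pixels per frame)

def brightness(x, y, t):
    """Synthetic translating pattern I(x, y, t) (an assumption for the demo)."""
    return np.sin(0.3 * (x - U_TRUE * t)) + np.sin(0.2 * (y - V_TRUE * t))

def spatial_gradient(x, y, t):
    """Analytic (I_x, I_y) of the synthetic pattern."""
    ix = 0.3 * np.cos(0.3 * (x - U_TRUE * t))
    iy = 0.2 * np.cos(0.2 * (y - V_TRUE * t))
    return ix, iy

def a_posteriori_bound(u_est, v_est, x, y, t=0.0, dt=1.0):
    """Right-hand side of (46): a lower bound on ||(du, dv)|| * dt."""
    diff = brightness(x + u_est * dt, y + v_est * dt, t + dt) - brightness(x, y, t)
    ix, iy = spatial_gradient(x, y, t)
    return abs(diff) / np.hypot(ix, iy)

print(a_posteriori_bound(U_TRUE, V_TRUE, 1.0, 1.0))       # exact estimate
print(a_posteriori_bound(U_TRUE + 0.1, V_TRUE, 1.0, 1.0)) # estimate off by 0.1 in u
```

Because (46) follows from projecting the flow error onto the gradient direction, the bound only registers the error component along the spatial gradient; an error perpendicular to the gradient leaves it near zero.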

If an over-determined set of equations is used to estimate optical flow, then
measurement errors in the gradients and incompatibilities among the constraint equations due to
differential motion will be reflected in the residual of the solution. The residual vector can be
estimated by

$$G\hat{w} - b = r \qquad (47)$$

where $\hat{w}$ is the estimated optical flow and $r$ is the residual. A large residual indicates that
substantial errors exist in the system and that the estimated flow vector is likely to be
inaccurate.

The residual vector will be especially large at occlusion edges, where the change in flow
is discontinuous. It has been proposed that the residual error be used as an indication of the
presence of an occlusion edge [10]. To be identifiable, the change in optical flow across an
occlusion edge must lead to an error which is greater than that normally encountered from
other measurement errors. A threshold on the residual must be established which will
normally be exceeded only at significant discontinuities in the flow field. The error accrued from
a change in the flow vector is equivalent to a measurement error on the right-hand side of
the local optimization system. Since the equivalent error on the right-hand side is magnified
by the size of the spatial gradients, the threshold for identifying large residual errors should
be adaptive to the spatial gradients. Likewise, it was shown that the systematic
measurement errors in the gradients were related to the second derivatives of brightness, so the
threshold on the residual should depend upon the second derivatives as well.

5.3.  Methods 

The gradient-based approach is demonstrated with two versions of the local
optimization technique. The basic local optimization method performs a least squares minimization
on an over-determined set of gradient constraint equations to estimate optical flow at each
point. Each image is first blurred with a Gaussian blurring function. The standard deviation
of the blurring function used to collect the data presented here was about 2 pixels. The
blurring serves to reduce the noise in the image and linearize the brightness function.

Constraint  equations  from  a  group  of  neighboring  points  are  gathered  to  produce  an 
over-determined  system  of  linear  equations  of  the  form 

Gw  =  -b  (48) 

where, 


Each row of G and b is evaluated at a different point. To ensure that the equations are
sufficiently distinct we selected neighbors from a 5×5 window centered around the point to
be estimated.

In general the over-determined system (48) has no exact solution. An approximate
solution is found by minimizing the residual vector $r$, defined in (47). The flow estimate is
chosen to be the vector $\hat{w}$ which minimizes some criterion function of $r$. In our work we
minimize $\|r\|_2$ by letting

$$\hat{w} = G^{+} b \qquad (50)$$

where $G^{+}$ is the pseudo-inverse of G [15]. Calculation of the pseudo-inverse requires the
inversion of the 2×2 matrix $G^{T}G$. The inverse will not exist where the local gradients do not
sufficiently constrain optical flow to allow for an exact solution. In this case the confidence
of the flow estimate is set to zero and $\hat{u}$ and $\hat{v}$ are undefined.
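The least squares step can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the report: each constraint is taken to be $I_x u + I_y v = -I_t$, the right-hand side is held in a variable named `rhs` to make that sign convention explicit, and the function name and the determinant tolerance `det_tol` are hypothetical.

```python
import numpy as np

def estimate_flow(ix, iy, it, det_tol=1e-9):
    """Least-squares flow from stacked gradient constraints.

    ix, iy, it : 1-D arrays of spatial and temporal gradients, one entry
                 per neighbor in the window.
    Returns (w, residual), or (None, None) when G^T G is singular, i.e. the
    gradients do not constrain the flow (confidence would be set to zero).
    """
    G = np.column_stack([ix, iy])          # one row per constraint equation
    rhs = -np.asarray(it, dtype=float)     # assumed convention: G w = -I_t
    gtg = G.T @ G                          # the 2x2 matrix inverted by the pseudo-inverse
    if abs(np.linalg.det(gtg)) < det_tol:
        return None, None
    w = np.linalg.solve(gtg, G.T @ rhs)    # normal equations
    residual = G @ w - rhs
    return w, residual

# Synthetic 5x5 window with a known flow (values chosen for the demo).
rng = np.random.default_rng(0)
ix, iy = rng.normal(size=25), rng.normal(size=25)
u_true, v_true = 1.0, -2.0
it = -(ix * u_true + iy * v_true)
w, residual = estimate_flow(ix, iy, it)
print(w, np.linalg.norm(residual))

# Parallel gradients (a straight edge): the system is singular.
print(estimate_flow([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.1, 0.2, 0.3]))
```

Solving the normal equations directly mirrors the 2×2 inversion described above; a library routine such as `np.linalg.lstsq` would give the same answer with better numerical behavior near singularity.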

A confidence is assigned to each flow estimate on the basis of

(a) an estimate of the measurement error in the temporal gradient,

(b) an estimate of the conditioning,

(c) the size of the residual vector r, and

(d) the a posteriori bound given by (46).

The importance of each of these factors in determining the accuracy of estimates is
discussed above. That analysis does not, however, provide us with a formula for estimating the
total error in the flow vector $(\hat{u}, \hat{v})$. We must find a means to combine several factors which
each indicate the presence of conditions which can lead to errors.

Recall how each factor outlined above relates to the error in $(\hat{u}, \hat{v})$. The systematic
measurement error in the temporal gradient depends on the linearity of the brightness
function. The change in the spatial gradients between successive frames provides an indication of
the linearity of the brightness function over the region which has translated past a point [5].
To obtain an estimate of the contribution of this error to errors in $\hat{w}$, we divide the magnitude of
the change in the spatial gradients by the magnitude of the spatial gradient.

The error propagation characteristics of the linear system $Gw = b$ can be determined by
examining the matrix of spatial gradients. If the linear system is ill-conditioned, small
measurement errors will tend to produce large errors in $(\hat{u}, \hat{v})$.

The residual vector indicates the degree to which the estimated flow vector jointly
satisfies the system of constraint equations. But the value of the residual vector is not easy
to interpret because the size of the residual is dependent on the overall magnitude of the
brightness gradients. We normalize the residual by determining, for each equation, the
minimum distance between the estimate and the equation. This is equal to the distance
between the estimate and the constraint equation along a line perpendicular to the constraint
equation that passes through the estimate. The average minimum distance is used as an
index of the degree to which the equations are satisfied.
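The normalized residual can be computed directly from the constraint lines in flow space. The following sketch assumes each constraint has the form $I_x u + I_y v + I_t = 0$, so that the perpendicular distance from an estimate to the line is the absolute residual divided by the gradient magnitude; the function name and sample values are illustrative.

```python
import numpy as np

def mean_constraint_distance(ix, iy, it, u, v):
    """Average perpendicular distance from (u, v) to each constraint line.

    Each line is I_x * u + I_y * v + I_t = 0 in (u, v) space (assumed form).
    Rows with a zero spatial gradient must be excluded by the caller.
    """
    ix, iy, it = (np.asarray(a, dtype=float) for a in (ix, iy, it))
    distances = np.abs(ix * u + iy * v + it) / np.hypot(ix, iy)
    return distances.mean()

# Two constraints that intersect at (u, v) = (1, 2).
ix, iy, it = [1.0, 0.0], [0.0, 2.0], [-1.0, -4.0]
print(mean_constraint_distance(ix, iy, it, 1.0, 2.0))  # estimate on both lines
print(mean_constraint_distance(ix, iy, it, 2.0, 2.0))  # one unit off the first line
```

Unlike the raw residual, this index does not change when all brightness gradients are scaled by a common factor, which is the property the normalization is meant to provide.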

Once an estimate has been obtained, the a posteriori error bound given by (46) can be
used to judge the accuracy of the estimate. In locations where this bound is large the
computed optical flow vector is likely to be in error.


Each of the measurements described above provides an index of the expected error in the
flow estimate. The four error estimates are not independent. The residual error and the a
posteriori bound measure the accumulative error, from all sources, in the flow estimate. The
variation in the spatial gradient and the conditioning of G measure conditions which are
likely to lead to poor estimates: nonlinearity in the spatial brightness function is particularly
troublesome for gradient measurement, and the conditioning of G conveys the error
propagation characteristics of the linear system. Even though the four estimates are not independent,
we found that they were best treated as separate sources of information and best combined
multiplicatively. We examined a number of combination rules and found that the results
were not highly sensitive to the particular rule for combining error estimates. A measure of
confidence was obtained from the inverse of the error estimates. The confidence value can be
interpreted as a rough measure of the likelihood that an optical flow estimate is correct.
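One possible multiplicative combination is sketched below. The report does not give its exact rule, so the form used here (the inverse of the product of the four error estimates, with a small constant `eps` to avoid division by zero) is an assumption made for illustration, as are the function and argument names.

```python
def combine_confidence(temporal_error, condition_number, residual_distance,
                       posteriori_bound, eps=1e-6):
    """Hypothetical confidence: inverse of the product of four error estimates.

    All inputs are non-negative error indicators; larger values mean a less
    trustworthy flow estimate. eps keeps the product strictly positive.
    """
    product = 1.0
    for error in (temporal_error, condition_number, residual_distance,
                  posteriori_bound):
        product *= error + eps
    return 1.0 / product

good = combine_confidence(0.1, 2.0, 0.1, 0.1)
poor = combine_confidence(0.5, 20.0, 0.5, 0.5)
print(good > poor)
```

With a rule of this kind a single large error indicator is enough to pull the confidence down, whichever of the four sources it comes from.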

5.3.1. Local Optimization with Iterative Registration

The simple method of local optimization can be extended by a method of iterative
refinement. Flow estimates are used to register the frame pair on each successive iteration of
the estimation procedure. It was earlier shown that the measurement error in the temporal
gradient could be significantly reduced if the registration locally reduced the displacement of
the image frames. Since the optical flow field will usually contain variations, the predicted
registration will differ across the image. To obtain a consistent linear system, a small region
of the first frame must be registered with the second frame on the basis of the predicted flow
at the point for which optical flow is to be estimated. A system of linear equations is
constructed from the registered region.

This process can be performed iteratively, using the optical flow estimates at the
previous stage to register the frame pair on the next iteration. It is important to emphasize that
at each stage, the registration can only be expected to improve performance when the new
registration is an improvement over the registration in the last iteration. Otherwise, the new
estimate of optical flow will, in general, be worse than the previous estimate. Since it is
desirable to register the image only where the flow estimates are believed to be correct, we
register in proportion to the confidence in the flow estimate. A flow field of zero flow vectors
is used to initialize the first iteration.

The iterative registration technique is employed with variable blurring to produce a
coarse-to-fine system of analysis. Images are blurred with a Gaussian weighting function. In
early iterations the standard deviation of the Gaussian weighting function is large. The
standard deviation of the weighting function is reduced in each successive iteration. At each
level, the radius of the blurring function should be large enough to guarantee that the
brightness function is approximately linear over the maximum expected flow from the registered
images.

The size of the neighborhood from which the constraint equations are selected must
depend upon the amount by which the images are blurred. At a coarse level of analysis there is
little detail which distinguishes nearby points. To obtain sufficiently different
equations the separation between observation points must be increased; otherwise the
conditioning of the linear system will degenerate.

Our system contains four iterations which correspond to four levels of coarseness. The
neighbor size and the value of the standard deviation for the approximation to the Gaussian
weighting function are given in Table 2 for each of the four iterations.


Iteration    Blur Radius r    Neighborhood Size

    1             7                  6
    2             5                  4
    3             3.5                3
    4             2                  2

Table 2. The coarse-to-fine analysis.


A difficulty with the coarse-to-fine system is that the flow estimates for stationary and
slowly moving points made at coarse levels may be worse than the initially assumed zero
vector. To ensure that the new flow estimate made at one level is not worse than the value
input into the level, we examine the error bound given by (46) for both the initial and new
estimates. If the error bound for the new estimate is significantly larger than the bound for
the old estimate, the new estimate is ignored.

5.4.  Results 

The two methods described above were tested with the two image pairs presented in
Figure 2. In the first sequence the camera was stationary. The two toy trains in the center
of the first image move toward each other in the second image. The second sequence
simulates a view from an aircraft flying over a city. The scene is actually a model of downtown
Minneapolis. (This picture originally appeared in Barnard's thesis [18].)

The optical flow fields obtained with the simple local optimization technique are shown
in Figure 3.a and Figure 3.b for the moving trains and flyover scenes. Associated with each
vector is a confidence in the correctness of the value. A threshold on confidence was
established which produced a reasonably dense sampling of mostly correct values. Only vectors
which exceeded the confidence threshold are displayed. The resulting field was too dense to
display clearly in its entirety. Consequently, only 20% of the vector fields are shown in
Figure 3.

The results of the coarse-to-fine method of iterative refinement are shown in Figures 3.c
and 3.d. Confidence thresholds were established which produced vector densities
comparable to that obtained with simple local optimization. Both techniques produce
reasonably accurate results with the moving train sequence.


The two techniques are more easily distinguished on the basis of their performance with
the flyover sequence. The simple local optimization method produces a large number of
errors even for the relatively sparse sampling of vectors displayed in Figure 3.b. The method
of iterative registration generated many fewer errors in fields which are much more dense
than that obtained with the simple local optimization approach.

Note  the  areas  where  very  few  vectors  are  displayed  Optical  flow  is  poorly  estimated  m 
thes-  regions  and  low  values  of  confidence  are  assigned  to  the  estimates  obtained  there  The 
problematic  regions  are  usually  fit  into  one  or  more  of  the  following  characterizations 

1. largely homogeneous regions,

2. highly textured regions which are moving, or

3. regions which contain large discontinuities in the flow field.

Optical flow estimates obtained in homogeneous areas are likely to be in error because of the
poor conditioning of linear systems constructed in these regions. The temporal gradient is
poorly measured in highly textured regions which undergo significant motion. In regions
which contain large discontinuities in the flow field, the temporal gradient is poorly estimated
and the systems of equations from the region are likely to contain inconsistencies.
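The conditioning problem in homogeneous regions can be illustrated with a small sketch. It assumes that the local linear system is a least-squares combination of gradient constraint equations over a neighborhood, so that the 2x2 matrix to be inverted is the sum of outer products of the spatial gradients; the function names are hypothetical and this is not claimed to be the report's implementation. A small eigenvalue of that matrix signals a poorly conditioned system.

```python
import math
from typing import List, Tuple

Gradient = Tuple[float, float]  # (Ix, Iy) at one pixel of the neighborhood


def normal_matrix(gradients: List[Gradient]) -> Tuple[float, float, float]:
    """Return (a, b, c) of the symmetric matrix [[a, b], [b, c]] = sum of g g^T."""
    a = sum(gx * gx for gx, _ in gradients)
    b = sum(gx * gy for gx, gy in gradients)
    c = sum(gy * gy for _, gy in gradients)
    return a, b, c


def eigenvalues(a: float, b: float, c: float) -> Tuple[float, float]:
    """Eigenvalues (largest first) of the symmetric matrix [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    spread = math.hypot(0.5 * (a - c), b)
    return mean + spread, mean - spread


def smallest_eigenvalue(gradients: List[Gradient]) -> float:
    """Small values mean the flow is weakly constrained in some direction."""
    return eigenvalues(*normal_matrix(gradients))[1]


if __name__ == "__main__":
    textured = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (-1.0, 1.0)]
    homogeneous = [(0.01, 0.0), (0.0, 0.01), (0.01, 0.01), (-0.01, 0.01)]
    print(smallest_eigenvalue(textured), smallest_eigenvalue(homogeneous))
```

In the homogeneous case every gradient is near zero, so both eigenvalues are tiny and noise in the temporal gradient dominates the solution.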

The success with which confidence estimates predict the accuracy of flow estimates is
demonstrated in Figure 4. The flow field produced by the simple local optimization technique
with the moving trains sequence is displayed with a low threshold on confidence in
Figure 4.a and a high threshold in Figure 4.b. As before, only 20% of the vectors which
exceed the threshold are displayed. Similar thresholds are shown for the method of iterative
registration in Figures 4.c and 4.d. For both methods confidence provides a reasonable index
of the accuracy of flow estimates. A sparse sampling of accurate estimates exceeds the high
confidence threshold. When the threshold is lowered, more dense fields are obtained with a
significantly greater number of bad vectors.

6.5.  Summary. 

The gradient constraint is a powerful tool for the analysis of dynamic imagery. Careful
examination of one gradient-based technique led to a number of conclusions about the causes
of errors, provided support for techniques to improve estimates, and indicated methods by
which the accuracy of estimates could be judged. This analysis suggests that optical flow
estimation should be adaptive to the nature of the brightness function and the characteristics
of motion in a region of the image.

The results demonstrate the feasibility of measuring the quality of optical flow estimates.
Gradient-based techniques are susceptible to a variety of problems and tend to produce
very poor estimates in troublesome areas of the image. Without accurate estimates of
confidence, good estimates cannot be distinguished from bad and gradient-based techniques
are of little use. This work emphasizes the importance of understanding the mechanisms
which underlie computational methods. An awareness of the strengths and weaknesses of
methods and of the way in which they operate can lead to adaptations and enhancements
which are of great practical value.


Appendix A. Optical Flow Variations

Several papers have examined the relationship between the three-dimensional motion of
objects and observers and the characteristics of the optical flow field. We will consider an
example which allows arbitrary three-dimensional translation of a planar surface to
demonstrate the important factors influencing changes in optical flow over the image.

Let the three-dimensional coordinate system be attached to the camera as in Figure A.1,
which is redrawn from Longuet-Higgins and Prazdny [10]. All motion is associated with the
camera. Let $U$, $V$, and $W$ be the translational velocities of the observer in the $X$, $Y$, and $Z$
directions. When motion is constrained to translation, the components of the
three-dimensional velocity vector are

$$X' = -U, \quad Y' = -V, \quad Z' = -W. \tag{A.1}$$


Using a perspective projection, the position of an object point on the image is related to its
three-dimensional position by

$$x = \frac{fX}{Z}, \quad y = \frac{fY}{Z} \tag{A.2}$$


where $f$ is the focal length of the camera. Velocity on the image plane, $(u, v)$, at a point
$(x, y)$ is

$$u = x', \quad v = y'. \tag{A.3}$$


Substituting from (A.2) into the right-hand side of (A.3) and differentiating we obtain

$$u = \frac{fX'}{Z} - \frac{fXZ'}{Z^2} = \frac{-fU + xW}{Z} \tag{A.4}$$

and

$$v = \frac{-fV + yW}{Z}. \tag{A.5}$$
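As a numerical check on (A.4) and (A.5), the sketch below compares the closed-form image velocity with a finite-difference estimate obtained by moving a scene point according to (A.1) and reprojecting it with (A.2). The function names and the sample values are illustrative assumptions.

```python
def project(X, Y, Z, f):
    """Perspective projection (A.2): image position of a scene point."""
    return f * X / Z, f * Y / Z


def translational_flow(x, y, Z, U, V, W, f):
    """Image velocity (A.4)-(A.5) for observer translation (U, V, W)."""
    u = (-f * U + x * W) / Z
    v = (-f * V + y * W) / Z
    return u, v


def finite_difference_flow(X, Y, Z, U, V, W, f, dt=1e-6):
    """Move the point by (A.1) for a short time and difference the projections."""
    x0, y0 = project(X, Y, Z, f)
    x1, y1 = project(X - U * dt, Y - V * dt, Z - W * dt, f)
    return (x1 - x0) / dt, (y1 - y0) / dt


if __name__ == "__main__":
    X, Y, Z = 1.0, 2.0, 10.0      # scene point in camera coordinates
    U, V, W = 0.5, -0.3, 2.0      # observer translation
    f = 1.5                       # focal length
    x, y = project(X, Y, Z, f)
    print(translational_flow(x, y, Z, U, V, W, f))
    print(finite_difference_flow(X, Y, Z, U, V, W, f))
```

The two results agree to within the truncation error of the finite difference.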


Consider a point $P_0$ on the surface of a rigid body which projects to $p_0$ on the image.
We orient the coordinate system so that $P_0$ lies on the observer's line of sight.* The
three-dimensional coordinates of $P_0$ are $(0, 0, R)$ and the position of $p_0$ on the image is
$(0, 0)$. We assume that the surface is planar so that

$$Z(X, Y) = R + \alpha X + \beta Y \tag{A.6}$$

for points on the surface near $P_0$.

* This coordinate transformation is not strictly correct for a planar retina as pictured in Figure A.1.
The change of coordinates can be justified in several ways. It can be assumed that the retina is
globally spherical but can locally be modeled as planar. Or, it can be assumed that the distance of
$p_0$ from the origin is sufficiently small relative to the focal length that the distortion introduced by
the transform will be minimal. Or finally, we can simply restrict our attention to points along the
line of sight.

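Following Longuet-Higgins and Prazdny, we introduce the dimensionless coordinate

$$z = \frac{Z - R}{Z}. \tag{A.7}$$

The components of optical flow formalized in (A.4) and (A.5) can be rewritten as

$$u = \frac{(-fU + xW)(1 - z)}{R} \tag{A.8}$$

and

$$v = \frac{(-fV + yW)(1 - z)}{R}. \tag{A.9}$$

The surface is assumed to be planar, so the derivatives of $u$ and $v$ with respect to $x$ and $y$
are well defined. At the point $p_0$, where $x = y = z = 0$, $u$ and $v$ are

$$u = -\frac{fU}{R} \quad \text{and} \quad v = -\frac{fV}{R}. \tag{A.10}$$

The derivatives of $u$ and $v$ are given by

$$u_x = \frac{\alpha U + W}{R}, \quad u_y = \frac{\beta U}{R} \tag{A.11}$$

$$v_x = \frac{\alpha V}{R} \quad \text{and} \quad v_y = \frac{\beta V + W}{R} \tag{A.12}$$

since

$$z_x = \frac{\alpha}{f} \quad \text{and} \quad z_y = \frac{\beta}{f}. \tag{A.13}$$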

Recall that the error incurred by assuming constant flow could be treated as measurement
error in $I_t$ on the right-hand side of (18). The magnitude of this error, relative to $I_t$, is
strongly dependent on the ratio of the magnitude of the change in optical flow to the magnitude
of the flow vector. We can now express the ratio of change-of-flow to flow in terms of
the three-dimensional parameters of shape and motion, and the viewing angle. The change
in optical flow between two points separated by $(\Delta x, \Delta y)$ is

$$(\Delta u, \Delta v) = (\Delta x\, u_x + \Delta y\, u_y,\ \Delta x\, v_x + \Delta y\, v_y). \tag{A.14}$$

Inserting the appropriate terms from (A.11) and (A.12) into (A.14) and dividing by optical
flow as given by (A.10), we arrive at an expression for the ratio of change-of-flow to flow at a
point.

[Equation (A.15), giving this ratio in terms of $\Delta x$ and $\Delta y$, is not legible in the source copy.]
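The ratio of change-of-flow to flow can be evaluated numerically by chaining (A.10) through (A.14). The sketch below is illustrative only: it assumes the derivative expressions $u_x = (\alpha U + W)/R$, $u_y = \beta U/R$, $v_x = \alpha V/R$, and $v_y = (\beta V + W)/R$, and the function name `flow_change_ratio` is hypothetical. It is not a transcription of (A.15).

```python
import math


def flow_change_ratio(dx, dy, alpha, beta, U, V, W, R, f):
    """|(du, dv)| / |(u, v)| at the image origin, chaining (A.10)-(A.14)."""
    u0, v0 = -f * U / R, -f * V / R                  # flow at p0, (A.10)
    u_x, u_y = (alpha * U + W) / R, beta * U / R     # assumed form of (A.11)
    v_x, v_y = alpha * V / R, (beta * V + W) / R     # assumed form of (A.12)
    du = dx * u_x + dy * u_y                         # change in flow, (A.14)
    dv = dx * v_x + dy * v_y
    return math.hypot(du, dv) / math.hypot(u0, v0)


if __name__ == "__main__":
    # Frontal plane (alpha = beta = 0), sideways speed 1, approach speed 2.
    print(flow_change_ratio(0.1, 0.0, 0.0, 0.0, 1.0, 0.0, 2.0, 10.0, 1.0))
    # Same geometry at five times the distance: R cancels in the ratio.
    print(flow_change_ratio(0.1, 0.0, 0.0, 0.0, 1.0, 0.0, 2.0, 50.0, 1.0))
```

Under these assumptions the distance $R$ cancels, and the ratio grows with the neighborhood size relative to the focal length, with the surface slope, and with the ratio of motion along the line of sight to motion perpendicular to it.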


The angle q is the angle subtended by $(\Delta x, \Delta y)$ with a focal length of $f$; this is simply the
size of the neighborhood measured in degrees of visual angle. The length of the change-of-flow
vector relative to the length of the flow vector depends upon the size of the neighborhood,
the slope of the surface viewed, and the ratio of velocity along the line of sight to
velocity perpendicular to the line of sight.

Figure 1. The sampled image function.

Figure 2. Image sequences.

Figure 3. Optical flow estimates.

Figure 4. The accuracy of confidence estimates. Optical flow estimates exceeding low and high
thresholds on confidence are displayed.

Figure A.1. The camera-based coordinate system.

Figure 3 panels: (a) simple local optimization; (b) simple local optimization;
(c) iterative registration; (d) iterative registration.


Dynamic Occlusion Analysis in Optical Flow Fields

and VALDIS A. BERZINS, MEMBER, IEEE


Abstract-Optical flow can be used to locate dynamic occlusion boundaries
in an image sequence. We derive an edge detection algorithm sensitive
to changes in flow fields likely to be associated with occlusion. The
algorithm is patterned after the Marr-Hildreth zero-crossing detectors
currently used to locate boundaries in scalar fields. Zero-crossing detectors
are extended to identify changes in direction and/or magnitude
in a vector-valued flow field. As a result, the detector works for flow
boundaries generated due to the relative motion of two overlapping
surfaces, as well as the simpler case of motion parallax due to a sensor
moving through an otherwise stationary environment. We then show
how the approach can be extended to identify which side of a dynamic
occlusion boundary corresponds to the occluding surface. The fundamental
principle involved is that at an occlusion boundary, the image
of the surface boundary moves with the image of the occluding surface.
Such information is important in interpreting dynamic scenes. Results
are demonstrated on optical flow fields automatically computed from
real image sequences.

Index Terms-Dynamic occlusion, dynamic scene analysis, edge detection,
optical flow, visual motion.


I. Introduction

An optical flow field specifies the velocity of the image
of points on a sensor plane due to the motion of the sensor
and/or visible objects. Optical flow can be used to estimate
aspects of sensor and object motion, the position and orientation
of visible surfaces relative to the sensor, and the relative
position of different objects in the field of view. As a result,
the determination and analysis of optical flow is an important
part of dynamic image analysis. In this paper, we develop an
operator for finding occlusion boundaries in optical flow fields.
We deal exclusively with dynamic occlusions in which flow
properties differ on either side of the boundary. The operator
is effective for both motion parallax, when a sensor is moving
through an otherwise stationary environment, and for more
general motion in which multiple moving objects can be in the
field of view. The multiple moving object situation is more difficult
because boundaries are marked by almost arbitrary combinations
of changes in magnitude and/or direction of flow.
The technique is extended so that a determination may be
made about which side of a dynamic occlusion boundary corresponds
to the occluding surface. Such a determination is of
great importance for interpreting the shape and spatial organization
of visible surfaces. Results are demonstrated on real
image sequences with flow fields computed using the token
matching technique described in [1]. Reliability is obtained
by dealing only with methods able to integrate flow field information
over relatively large neighborhoods so as to reduce
the intrinsic noise in fields determined from real image
sequences.

II. Boundary Detection

Conventional edge operators detect discontinuities in image
luminance. These discontinuities are difficult to interpret,
however, because of the large number of factors that can produce
luminance changes. Boundaries in optical flow can arise
from many fewer causes and, hence, are often more informative.
If a sensor is moving through an otherwise static scene, a
discontinuity in optical flow occurs only if there is a discontinuity
in the distance from the sensor to the visible surfaces
on either side of the flow boundary [2]. Discontinuities in
flow will occur for all visible discontinuities in depth, except
for viewing angles directly toward or away from the direction
of sensor motion. If objects are moving with respect to one
another in the scene, then all discontinuities in optical flow
correspond either to depth discontinuities or surface boundaries,
and most depth discontinuities correspond to flow
discontinuities.

The use of local operators to detect discontinuities in optical
flow has been suggested by others. Nakayama and Loomis [3]
propose a "convexity function" to detect discontinuities in
image plane velocities generated by a moving observer. Their
function is a local operator with a center-surround form. That
is, the velocity integrated over a band surrounding the center
of the region is subtracted from the velocity integrated over
the center. The specifics of the operator are not precisely
stated, but a claim is made [3, Fig. 3] that the operator returns
a positive value at flow discontinuities. (In fact, most reasonable
formulations of their operator would yield a value of 0 at
the boundary, with a positive value to one side or the other.)
Clocksin [2] develops an analysis of optical flow fields generated
when an observer translates in a static environment. He
shows that, in such circumstances, discontinuities in the magnitude
of flow can be detected with a Laplacian operator. In
particular, singularities in the Laplacian occur at discontinuities
in the flow. He also showed that, in this restricted environment,
the magnitude of optical flow at a particular image point
is inversely proportional to distance, and the distances can be
recovered to within a scale factor of observer speed. It is thus
trivial to determine which of two surfaces at an edge is occluding,
for example, by simply comparing magnitudes of the two
surface velocities, even when observer speed is unknown.

For this restricted situation in which a sensor moves through
an otherwise static world,

$$\mathrm{flow}(\mathbf{x}) = f_r(\mathbf{x}) + \frac{f_t(\mathbf{x})}{r} \tag{1}$$

where at an image point $\mathbf{x}$, $\mathrm{flow}(\mathbf{x})$ is the optical flow (a two-dimensional
vector), $f_r$ is the component of the flow due to
the rotation of the scene with respect to the sensor, $f_t$ is dependent
on the translational motion of the sensor and the
viewing angle relative to the direction of translation, and $r$ is
the distance between the sensor and the surface visible at $\mathbf{x}$
[4]. For a fixed $\mathbf{x}$, flow varies inversely with distance. Both
$f_r$ and $f_t$ vary slowly (and continuously) with $\mathbf{x}$. Discontinuities
in flow thus correspond to discontinuities in $r$. Furthermore,
it is sufficient to look only for discontinuities in the
magnitude of flow. This relationship holds only for relative
motion between the sensor and a single, rigid structure. When
multiple moving objects are present, (1) must be modified so
that there is a separate $f_r^{(i)}$ and $f_t^{(i)}$ specifying the relative motion
between the sensor and each rigid object. Discontinuities
associated with object boundaries may now be manifested in
the magnitude and/or direction of flow.
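A small numerical sketch of (1) shows why a depth step produces a step in flow magnitude when the sensor translates through a static scene. The function name `flow_at` and the sample values are illustrative assumptions; $f_r$ is set to zero to model pure translation.

```python
import math


def flow_at(f_r, f_t, r):
    """Equation (1): flow = f_r + f_t / r, with f_r and f_t as 2-D tuples."""
    return (f_r[0] + f_t[0] / r, f_r[1] + f_t[1] / r)


if __name__ == "__main__":
    f_r = (0.0, 0.0)   # no sensor rotation
    f_t = (4.0, 0.0)   # translational term, nearly constant across the edge
    near = flow_at(f_r, f_t, r=2.0)   # surface on the near side of a depth edge
    far = flow_at(f_r, f_t, r=8.0)    # surface on the far side
    # The nearer surface has the larger flow magnitude.
    print(math.hypot(*near), math.hypot(*far))
```

Because $f_r$ and $f_t$ vary slowly with image position, the jump in magnitude across the edge comes entirely from the jump in $r$.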

Boundary detectors for optical flow fields should satisfy two
criteria: 1) sensitivity to rapid spatial change in one or both of
the magnitude and direction of flow, and 2) operation over a
sufficiently large neighborhood to reduce sensitivity to noise
in computed flow fields. It is desirable to achieve the second
criterion without an unnecessary loss of spatial resolution in
locating the boundary or a need for postprocessing to reduce
the width of detected boundaries. The zero-crossing detectors
of Marr and Hildreth [5] may be extended to optical flow
fields in a manner that achieves both objectives [6]. For scalar
fields (e.g., intensity images), zero-crossing edge detection proceeds
as follows. 1) Smooth the field using a symmetrical Gaussian
kernel. 2) Compute the Laplacian of the smoothed function.
3) Look for directional zero crossings of the resulting
function (e.g., look for points at which, along some direction,
the function changes sign). Under a set of relatively weak assumptions,
these zero crossings can be shown to correspond to
points of most rapid change in some direction in the original
function. The convolution with a Gaussian provides substantial
noise reduction and, in addition, allows tuning of the
method for edges of a particular scale. Steps 1) and 2) involve
evaluating the function $\nabla^2 G * I$, where $G$ is a Gaussian kernel,
$*$ is the convolution operation, and $I$ is the original image. The
effect of the $\nabla^2 G$ operator can be approximated by blurring
the original function with two different Gaussian kernels of
appropriate standard deviation, and then taking the difference
of the result. This formulation results in computational simplifications
[7], [8] and also corresponds nicely to several physiological
models that have been proposed for early visual
processing.
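The three steps above can be sketched for a one-dimensional scalar signal, using the difference-of-Gaussians approximation in place of an explicit Laplacian. This is a simplified illustration under stated assumptions: the kernel truncation at three standard deviations, the border handling, the ratio of 1.6, and all function names are choices made here, not details taken from the paper.

```python
import math
from typing import List


def gaussian_kernel(sigma: float) -> List[float]:
    """Normalized, sampled Gaussian truncated at about 3 sigma."""
    radius = int(math.ceil(3.0 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal: List[float], sigma: float) -> List[float]:
    """Convolve with a Gaussian, replicating the end samples at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += w * signal[j]
        out.append(acc)
    return out


def difference_of_gaussians(signal: List[float], sigma: float, ratio: float = 1.6) -> List[float]:
    """Narrow blur minus wide blur: steps 1) and 2) in approximate form."""
    narrow = smooth(signal, sigma)
    wide = smooth(signal, sigma * ratio)
    return [a - b for a, b in zip(narrow, wide)]


def zero_crossings(values: List[float], min_slope: float) -> List[int]:
    """Step 3): indices i where values[i] and values[i+1] differ in sign
    and the jump between them is at least min_slope."""
    hits = []
    for i in range(len(values) - 1):
        a, b = values[i], values[i + 1]
        if a * b < 0.0 and abs(a - b) >= min_slope:
            hits.append(i)
    return hits


if __name__ == "__main__":
    step = [0.0] * 30 + [1.0] * 30          # intensity edge between samples 29 and 30
    response = difference_of_gaussians(step, sigma=2.0)
    print(zero_crossings(response, min_slope=0.01))
```

The slope threshold suppresses sign changes caused by rounding noise in flat regions, which mirrors the practical use of a threshold on zero-crossing slope.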


The effect of this approach is to identify edge points where
the intensity of the blurred image is locally steepest. More
precisely, an edge can be defined as a peak in the first directional
derivative, or as a zero crossing in the second directional
derivative. At an edge, the second directional derivative has
zero crossings in almost all directions, but the preferred direction
is normal to the locus of the zero crossings, which is the
same as the direction where the zero crossing is steepest for
linearly varying fields [5]. For vector images such as optical
flow fields, the directional derivatives are vector valued, and
we want the magnitude of the first directional derivative to
have a peak.

This extension to two-dimensional flow fields is relatively
straightforward. The optical flow field is first split into separate
scalar components corresponding to motion in the x and
y directions. The $\nabla^2 G$ operator is applied to each of these
component images, and the results combined into a componentwise
Laplacian of the original flow field. (The Laplacian is a
vector operator which can be expressed in arbitrary coordinate
systems. For convenience, we choose a Cartesian coordinate
system.) This componentwise Laplacian operation is implemented
by subtracting two componentwise blurred versions of
the original. With the proper set of weak assumptions, discontinuities
in optical flow correspond to zeros in both of these
component Laplacian fields. At least one of the components
will have an actual zero crossing. The other will have either a
zero crossing or will have a constant zero value in a neighborhood
of the discontinuity. If the componentwise Laplacians
are treated as a two-dimensional vector field, discontinuities
are indicated by directional reversals in the combined field.
Because of the discrete spatial sampling and a variety of noise
sources, the zeros or zero crossings in the two components of
the field may not actually be exactly spatially coincident. Thus,
exact reversal is not expected, and a range of direction changes
of about 180° is accepted. A threshold on the sum of the vector
magnitudes at the location of the flip is used to ensure that
the zero crossing is of significant slope. This is analogous to
the threshold on zero-crossing slope which is often used in practice
when zero-crossing techniques are used on intensity images,
and serves to filter out small discontinuities.
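The componentwise procedure can be sketched along a single scan line of flow vectors. This is a simplified one-dimensional illustration, assuming a difference-of-Gaussians approximation with ratio 1.6, a 30-degree tolerance around exact reversal, and hypothetical function names; the paper's operator works on two-dimensional fields.

```python
import math


def gaussian_kernel(sigma):
    """Normalized, sampled Gaussian truncated at about 3 sigma."""
    radius = int(math.ceil(3.0 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(values, sigma):
    """Gaussian blur with end samples replicated at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(values) - 1
    return [sum(w * values[min(max(i + k - radius, 0), last)]
                for k, w in enumerate(kernel))
            for i in range(len(values))]


def dog(values, sigma, ratio=1.6):
    """Difference of two blurred versions of one scalar component."""
    return [a - b for a, b in zip(smooth(values, sigma), smooth(values, sigma * ratio))]


def flow_boundaries(flow, sigma, min_strength, tolerance_deg=30.0):
    """Indices i where the componentwise response reverses direction between
    samples i and i+1, with summed magnitude of at least min_strength."""
    lap_u = dog([u for u, _ in flow], sigma)
    lap_v = dog([v for _, v in flow], sigma)
    cos_limit = -math.cos(math.radians(tolerance_deg))
    hits = []
    for i in range(len(flow) - 1):
        ax, ay, bx, by = lap_u[i], lap_v[i], lap_u[i + 1], lap_v[i + 1]
        mag_a, mag_b = math.hypot(ax, ay), math.hypot(bx, by)
        if mag_a + mag_b < min_strength or mag_a == 0.0 or mag_b == 0.0:
            continue  # too weak to count as a significant zero crossing
        if (ax * bx + ay * by) / (mag_a * mag_b) <= cos_limit:
            hits.append(i)  # direction flips by roughly 180 degrees
    return hits


if __name__ == "__main__":
    # Equal magnitude, different direction.
    turn = [(1.0, 0.0)] * 30 + [(0.0, 1.0)] * 30
    # Same direction, different magnitude.
    speed = [(1.0, 0.0)] * 30 + [(3.0, 0.0)] * 30
    print(flow_boundaries(turn, 2.0, 0.02), flow_boundaries(speed, 2.0, 0.02))
```

Both kinds of boundary are reported at the same location, since a reversal of the combined vector occurs whether the change is in direction, in magnitude, or in both.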

The approximations made by the computations described
above will be good if the variation of the field parallel to the
edge is much more uniform than the variation normal to the
edge. For scalar images, exact results will be obtained if the
intensity varies at most linearly along the edge contour [5].
For vector images, the field must vary at most linearly in some
neighborhood of the edge contour, so that the assumptions required
are slightly stronger than for scalar images. Appendix I
contains the analysis for the case of vector images.

Two examples of this technique applied to real images are
shown below. In both examples, the objects are toy animals
with flat surfaces, shown moving in front of a textured background.
In Fig. 1(a), the tiger translates parallel to the image
plane from right to left between frames 1 and 2. The elephant
rises off its front legs between frames 1 and 2, effectively rotating
about an axis at its hind feet oriented perpendicularly to
the image plane. The elephant also translates slightly to the
left parallel to the image plane. The optical flow vectors, shown
in Fig. 1(b), were obtained by relaxation labeling token matching,
as described in [1]. Notice that the flow vectors on the
elephant and tiger have approximately the same magnitude but
differ in direction. Each component of this flow field was convolved
with approximated Gaussians of standard deviations
3.65 and 5.77. The ratio of these standard deviations is 1 : 1.6.
The two convolved flow fields were subtracted, and the resulting
vector field was searched for reversals in vector direction.
A boundary strength threshold was chosen to eliminate
noise points due to small, local variations in estimated flow.
In Fig. 1(c), the points where reversals were found are shown
overlaid on the original flow field, and in Fig. 1(d) the points
are overlaid in white on the first image of the pair. The edge
points form a good boundary between the discontinuous optical
flow vector fields [Fig. 1(c)]; but because these fields are
so sparse, the edge points match only the approximate locations
of the true edges [Fig. 1(d)].

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. PAMI-7, NO. 4, JULY 1985

Fig. 1. (a) Image pair. (b) Optical flow. (c) Detected edge overlaid onto
flow field. (d) Detected edge overlaid onto first frame of sequence.

Fig. 2. (a) Image pair. (b) Optical flow. (c) Detected edge overlaid onto
flow field. (d) Detected edge overlaid onto first frame of sequence.

In Fig. 2(a), both the tiger and elephant are translating to
the right, parallel to the image plane between frames 1 and 2.
The flow field shown in Fig. 2(b) was obtained in the same
manner as in Fig. 1(b). The direction of the flow vectors on
both animals is approximately the same, but there is a discontinuity
in magnitude. Two Gaussian filtered versions of
the flow fields were obtained with standard deviations of 3.16
and 5.16, a ratio of 1 : 1.6. The locations of vector reversals
resulting from differencing the two filtered fields are shown
in Fig. 2(c) and (d).

The width of the Gaussian kernel used in the $\nabla^2 G$ operator,
the density of the computed optical flow field, and the spatial
variability of flow all interact to affect the performance of the
boundary detection. As with the use of zero-crossing detectors
for scalar fields, it may be desirable to use a range of kernel
sizes and then combine the results to obtain a more robust indicator
for the presence of a boundary. While zero-crossing
contours are, in principle, connected, the use of a threshold
on the slope at the zero crossing results in some portions of
the boundary being missed. In practice, zero-crossing boundary
detection for both scalar and vector fields often requires
such thresholds to avoid significant problems with false boundary
indications in slowly varying regions of the fields. Work
still needs to be done on better techniques for selecting zero
crossings that correspond to true boundaries.


III.  Identifying  Occluding  Surfaces 

When analyzing edges between dissimilar image regions that
arise due to occlusion boundaries, it is important to determine
which side of the edge corresponds to the occluding surface.
Occlusion boundaries arise due to geometric properties of the
occluding surface, not the occluded surface. Thus, while the
shape of the edge provides significant information on the structure
of the occluding surface, it says little or nothing about the
structure of the surface being occluded. In situations where a
sensor is translating through an otherwise static scene, any significant
local decrease in $r$ in (1) increases the magnitude of
flow. Thus, at a flow boundary, the side having the larger magnitude
of flow will be closer, and thus will be occluding the
farther surface. Sensor rotation complicates the analysis, while
if objects in the field of view move with respect to each other,
there is no direct relationship between magnitude of flow and
$r$. Surfaces corresponding to regions on opposite sides of a
boundary may move in arbitrary and unrelated ways. However,
by considering the flow values on either side of the boundary
and the manner in which the boundary itself changes over time,
it is usually possible to find which side of the boundary corresponds
to the occluding surface, although the depth to the surfaces
on either side cannot be determined.

The principle underlying the approach is that the image of the
occluding contour moves with the image of the occluding surface.
Fig. 3 illustrates the effect for simple translational motion.
Shown on the figure are the optical flow of points on
each surface and the flow of points on the image of the boundary.
In Fig. 3(a), the left surface is in front and occluding the
surface to the right. In Fig. 3(b), although the flow values associated
with each surface are the same, the left surface is now
behind and being occluded by the surface to the right. The
occluding surface cannot be determined using only the flow
in the immediate vicinity of the boundary. The two cases can
be distinguished because, in Fig. 3(a), the flow boundary determined
by the next pair of images will be displaced to the left,
while in Fig. 3(b) it will be displaced to the right.


Fig. 3. Optical flow at a boundary at two instants in time. (a) Surface
to the left is in front. (b) Surface to the right is in front.

To formalize the analysis, we need to distinguish the optical
flow of the boundary itself from the optical flow of surface
points. The flow of the boundary is the image plane motion
of the boundary, which need not have any direct relationship
to the optical flow of regions adjacent to the boundary. The
magnitude of the optical flow of boundary points parallel to
the direction of the boundary typically cannot be determined,
particularly for linear sections of boundary. Thus, we will limit
the analysis in this section to the component of optical flow
perpendicular to the direction of the image of occlusion boundaries.
As a result, if the flow on both sides of the boundary is
parallel to the boundary, the boundary will still be detectable,
but the method given here will provide no useful information
about which surface is occluding.

We can now state the basic principle more precisely. Choose
a coordinate system in the image plane with the origin at a particular
boundary point and the x axis oriented normal to the
boundary contour, with $x > 0$ for the occluding surface. The
camera points in the z direction, and the image plane is at
$z = 0$. Let $f_x(x, y)$ be the x component of optical flow at the
point $(x, y)$. Let $f_b$ be the x component of the flow of the
boundary itself at the origin (i.e., $f_b$ is the image plane velocity
of the boundary in a direction perpendicular to the boundary).
Then, for rigid objects,

$$f_b = \lim_{x \to 0^+} f_x(x, 0) = f_x(0, 0). \tag{2}$$
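Equation (2) suggests a simple decision rule: the occluding side is the one whose flow component normal to the boundary matches the normal velocity of the boundary itself. The sketch below illustrates that rule under stated assumptions; the function name `occluding_side`, the scalar inputs, and the `min_margin` tie-breaking parameter are hypothetical and are not taken from the paper.

```python
def occluding_side(normal_flow_left: float,
                   normal_flow_right: float,
                   boundary_normal_flow: float,
                   min_margin: float = 1e-3) -> str:
    """Return 'left', 'right', or 'undetermined'.

    All three inputs are velocity components measured along the boundary
    normal. The side whose surface flow best matches the boundary's own
    motion is taken to be the occluding surface.
    """
    err_left = abs(normal_flow_left - boundary_normal_flow)
    err_right = abs(normal_flow_right - boundary_normal_flow)
    if abs(err_left - err_right) < min_margin:
        # Covers flow parallel to the boundary on both sides, where the
        # normal components carry no information about occlusion.
        return "undetermined"
    return "left" if err_left < err_right else "right"


if __name__ == "__main__":
    # The left surface moves left at speed 1; the right surface moves right at 0.5.
    print(occluding_side(-1.0, 0.5, boundary_normal_flow=-1.0))  # boundary follows the left surface
    print(occluding_side(-1.0, 0.5, boundary_normal_flow=0.5))   # boundary follows the right surface
```

In practice the boundary's normal velocity would be estimated from its displacement between successive flow fields, so the comparison is subject to the noise in both measurements.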

We will show that this relationship is true for arbitrary rigid
body motion under an orthographic projection. For a single
smooth surface, perspective projections are locally essentially
equivalent to a rotation plus a scale change, although the analysis
is more complex. Equation (2) specifies a purely local constraint
and, as the limit is taken from only one side of the
boundary, is dependent on flow values on a single surface.
Thus, the limit result will hold as well for perspective projections.
Algorithms which utilize the result in (2) may suffer,
however, if properties of more than a truly local area of the
field are utilized. The instantaneous motion of a rigid object
relative to a fixed coordinate system can be described with respect
to a six-dimensional, orthogonal basis set. Three values
specify translational velocity, the other three specify angular
velocity. These six coordinates of motion can be conveniently
classified into four types: translation at constant depth, translation
in depth, rotation at constant depth, and rotation in depth.
Translation at constant depth is translation in a direction parallel
to the image plane. Translation in depth is translation perpendicular
to the image plane. Rotation at constant depth is
rotation around an axis perpendicular to the image plane. Rotation
in depth is rotation around an axis parallel to the image
plane. Any instantaneous motion can be described as a combination
of these four types. For orthographic projections,
translation in depth has no effect on the image. Thus, we need
to show that the above relationship relating boundary and surface
flow holds for the three remaining motion types.

A point on the surface of an object in the scene that projects
into a boundary point in the image will be referred to as a
generating point of the occlusion boundary. The family of
generating points defines a generating contour, which lies along
the extremal boundary of the object with respect to the
sensor. For both translation and rotation at constant depth, the
generating contour remains fixed to the occluding surface over
time. Thus, the boundary and adjacent points move with
exactly the same motion. As a result, the projection of the
surface flow in the direction normal to a particular boundary point
is identical to the projection of the boundary flow in the same
direction. (The result is strictly true only for instantaneous
flow. Over discrete time steps, boundary curvature will affect
the projected displacement of the boundary.)
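The four motion types discussed above can be made concrete with a small sketch. It assumes, for illustration only, that the rotation axis passes through the coordinate origin and that the orthographic projection simply drops the z coordinate; the function names are hypothetical and not taken from the paper.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def orthographic_flow(point, translation, angular_velocity):
    """Image-plane flow (x, y) of a 3-D point under instantaneous rigid motion.

    The 3-D velocity is T + Omega x P (rotation about the origin, an
    assumption of this sketch). Orthographic projection along z keeps only
    the x and y components of that velocity.
    """
    rot = cross(angular_velocity, point)
    return (translation[0] + rot[0], translation[1] + rot[1])


P = (1.0, 2.0, 5.0)
# Translation in depth: no image flow at all.
print(orthographic_flow(P, (0.0, 0.0, 3.0), (0.0, 0.0, 0.0)))
# Rotation at constant depth (axis along z): flow does not depend on depth.
print(orthographic_flow(P, (0.0, 0.0, 0.0), (0.0, 0.0, 0.5)))
# Rotation in depth (axis along y): flow depends on the depth of the point.
print(orthographic_flow(P, (0.0, 0.0, 0.0), (0.0, 0.5, 0.0)))
```

Only rotation in depth makes the image flow depend on the depth of the point, which is why that case needs the separate treatment that follows.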

The analysis of rotation in depth is complicated by a need to
distinguish between sharp and smooth occlusion boundaries,
based on the curvature of the occluding surface. The
intersection of the surface of the object and a plane passing through
the line of sight to the generating point and the surface normal
at the generating point defines a cross section contour. The
cross section contour and the generating contour cross at right
angles at the generating point. Sharp boundaries occur when
the curvature of the cross section contour at a generating point
is infinite. Smooth boundaries occur when the curvature is
finite.

Sharp generating contours will usually remain fixed on the
object surface over time. (Exceptions occur only in the
infrequent situations in which, due to changes in the line of sight
with respect to the object, either a sharp boundary becomes
smooth or a flat face on one side of the generating point lines
up with the line of sight.) Smooth generating contours will
move along the surface of the object any time the surface
orientation at a point fixed to the surface near the extremal
boundary is changing with respect to the line of sight. Fig. 4 shows
examples of both possibilities. The figure shows a view from
above, with the sensor looking in the plane of the page and the
objects rotating around an axis perpendicular to the line of
sight. In Fig. 4(a), an object with a square cross section is being
rotated. Fig. 4(b) shows an object with a circular cross section.

For sharp boundaries, a surface point close to a generating
point in three-space projects onto the image at a location close
to the image of the generating point. The surface point and
the generating point move as a rigid body. For rigid body
motion, differences in flow between the images of two points go
to zero as the points become coincident in three-space. As a
result, surface points arbitrarily close to the generating point
project to the same flow values as the generating point itself.

Fig. 4. (a) Generating contour at a sharp boundary remains fixed to the
object surface. (b) Generating contour at a smooth boundary moves
relative to the object surface.

For smooth boundaries, the situation is more complex. The
surface points corresponding to the boundary may change over
time, so that points on the surface near the generating point
and the generating point itself may not maintain a fixed
relationship in three-space. The property described in (2) still
holds for rotation in depth, however. The formal proof of
this assertion is relatively complex and is given in Appendix B.
(The Appendix actually shows that the limit of surface flow is
equal to boundary flow for rotation of smooth objects around
an arbitrarily oriented axis.) Informally, the result holds
because the surface is tangent to the line of sight at the generating
point, so that any motion of the generating point with respect
to a point fixed to the surface is along the line of sight. The
difference between the motion of the surface near the
generating point and the motion of the generating point itself is a
vector parallel to the line of sight and, hence, does not appear in
the projected flow. This means that the motion of the
boundary in the x direction will be the same as that of a point fixed
to the surface at the instantaneous location of the generating
point. The limit property holds because the surface flow varies
continuously with x in the vicinity of the generating point, as
long as we restrict our attention to points that are part of the
same object.
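The informal argument for smooth boundaries can be checked numerically on a simple case. The sketch below is an illustration under stated assumptions, not the proof of Appendix B: it takes an elliptical cross section (semi-axes `a` and `b`, both hypothetical) seen from above, rotates it about an axis perpendicular to the line of sight, and compares the velocity of the projected boundary with the projected velocity of the surface point that is currently the generating point. All function names are invented for this example.

```python
import math


def boundary_x(a, b, psi):
    """Image x of the right-hand extremal boundary of an ellipse with
    semi-axes a, b rotated by psi, under orthographic projection."""
    return math.sqrt((a * math.cos(psi)) ** 2 + (b * math.sin(psi)) ** 2)


def surface_x(a, b, psi, t):
    """Image x of the body-fixed surface point with ellipse parameter t."""
    return a * math.cos(t) * math.cos(psi) - b * math.sin(t) * math.sin(psi)


def generating_parameter(a, b, psi):
    """Ellipse parameter t of the point that projects onto the boundary."""
    xg = boundary_x(a, b, psi)
    return math.atan2(-b * math.sin(psi) / xg, a * math.cos(psi) / xg)


def flows_at_boundary(a, b, psi, omega=1.0, dpsi=1e-6):
    """Return (boundary flow, surface flow at the generating point),
    both estimated by central differences in the rotation angle."""
    f_b = omega * (boundary_x(a, b, psi + dpsi)
                   - boundary_x(a, b, psi - dpsi)) / (2 * dpsi)
    t_g = generating_parameter(a, b, psi)  # held fixed: rides with the surface
    f_s = omega * (surface_x(a, b, psi + dpsi, t_g)
                   - surface_x(a, b, psi - dpsi, t_g)) / (2 * dpsi)
    return f_b, f_s


f_b, f_s = flows_at_boundary(a=2.0, b=1.0, psi=0.6)
print(f_b, f_s)  # the two values agree to within the finite-difference error
```

Although the generating point slides along the ellipse as it turns, the two flows coincide, as (2) predicts. For a circular cross section (`a == b`) both are zero.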

To develop an algorithm for actually identifying the
occluding surface at a detected boundary, we will start by assuming
only translational motion is occurring. (Violations of this
assumption are discussed below.) According to (2), we need
only look at the flow at the edge point and immediately to
either side to determine which side corresponds to the
occluding surface. In practice, however, this is inadequate. Edges
will be located imprecisely in each frame due to a variety of
effects. This imprecision is compounded when the location of
edge points is compared across frames to determine the flow
of the edge. By considering the pattern of change in the
Laplacian of the optical flow field, however, a simple binary
decision test can be constructed to determine which surface
velocity most closely matches that of the edge. As before, we
will use a coordinate system with its origin at the location of
some particular boundary point at a time $t_0$, the x axis oriented
normal to the orientation of the boundary, and consider only
$\mathrm{flow}_x$, the projection of flow onto the x axis. In this new
coordinate system, positive velocity values will correspond to
motion to the right. We will assume that the flow field in the
vicinity of the edge can be approximated by a step function.
The algorithm developed here is unaffected by constants added
to the flow field or by applying positive multiples to the
magnitude of flow. Therefore, to simplify analysis, normalize the
flow field by subtracting a constant value $f_a$ such that the
projected flows on the two sides of the edge have equal magnitude
and opposite sign, and scale the result to unit magnitude. The
resulting step edge can have one of two possible shapes,
depending upon whether the surface to the left is, after scaling
and normalizing, moving to the left or to the right (see Fig. 5).

When the two possible velocity functions are convolved with
a Gaussian blurring kernel, the resulting functions are shown
in Fig. 5(a) and (b). The Laplacian of these functions in the
direction perpendicular to the edge is equal to the second
derivative, and is shown in Fig. 5(c) and (d). These two cases may
be described analytically as follows.

Case 1: Given the step function $s(x)$, convolve $s(x)$ with a
Gaussian blurring function $g(x)$ of standard deviation $\sigma$:

$$h(x) = g * s.$$

Let $s(x) = -2u(x) + 1$, where

$$u(x) = \begin{cases} 0, & x < 0 \\ 1, & x > 0. \end{cases}$$

Because $s$ steps down by 2 at the origin, $h'(x) = -2g(x)$, and

$$h''(x) = \frac{2x}{\sigma^3 \sqrt{2\pi}}\, e^{-x^2/2\sigma^2}. \qquad (6)$$

Therefore,

$$h''(x) < 0 \text{ when } x < 0, \qquad h''(x) > 0 \text{ when } x > 0. \qquad (7)$$

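Case 2: The step function for case 2 is $-s(x)$, where $s(x)$ and
$u(x)$ are defined above. Then

$$h''(x) = -\frac{2x}{\sigma^3 \sqrt{2\pi}}\, e^{-x^2/2\sigma^2}. \qquad (9)$$

Therefore,

$$h''(x) > 0 \text{ when } x < 0, \qquad h''(x) < 0 \text{ when } x > 0. \qquad (10)$$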

At some later time $t_1$, the entire second derivative curve
$h''(x)$ will have shifted right or left, depending upon whether
the edge moves with the surface moving to the right or left.
Based upon the analysis above, in case 1, if the left surface
is occluding, the second derivative curve will be moving to the
right and the sign at the origin will become negative, while if
the right surface is occluding, the curve will be moving left and
the sign at the origin will be positive. In case 2, if the left
surface is occluding, the curve will be moving to the left and the
sign at the origin will be negative, while if the right surface is
occluding, the curve will be moving to the right and the sign
at the origin will be positive. Note that in both cases, when
the left surface is the occluding surface, the sign at the origin
will become negative, and when the right surface is occluding,
the sign at the origin will become positive. This is illustrated
in Fig. 5(e) and (f). In the original, unrotated coordinate
system, this is equivalent to stating that at time $t_1$ the direction,
normal to the edge, for which the second directional derivative
of optical flow is positive, evaluated at the location of the edge
at $t_0$, points toward the occluding surface. (The approach is
similar to that used in [9] to determine the direction of
motion of an intensity contour.) This analysis may be extended
to the general case where the original step function has not been
normalized. The direction of the second derivative at $t_1$ must
now, however, be evaluated at the point $(x_0, y_0) + (t_1 - t_0) f_a$.
(As $f_a$ is the average flow of the surfaces on either side of the
boundary, this point may be thought of as lying half-way
between the two possible image locations of the boundary at
time $t_1$.)
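The sign rule above can be exercised in closed form for the unnormalized case. The following is a minimal sketch, assuming a one-dimensional step in the normal flow component, a Gaussian blur of standard deviation `sigma`, and a unit time step; the function names and parameter values are illustrative choices, not the paper's.

```python
import math


def blurred_step_second_derivative(x, edge, f_left, f_right, sigma):
    """h''(x) for a step from f_left to f_right located at `edge`,
    after convolution with a Gaussian of standard deviation sigma."""
    d = x - edge
    g_prime = (-d / (sigma ** 3 * math.sqrt(2.0 * math.pi))
               * math.exp(-d * d / (2.0 * sigma ** 2)))
    return (f_right - f_left) * g_prime


def occluding_side(f_left, f_right, true_occluder, sigma=2.0, dt=1.0):
    """Simulate an edge that starts at x = 0 and moves with the occluding
    surface, then apply the sign test at the half-way point dt * f_a."""
    edge_t1 = dt * (f_left if true_occluder == "left" else f_right)
    f_a = 0.5 * (f_left + f_right)   # average flow of the two surfaces
    value = blurred_step_second_derivative(dt * f_a, edge_t1,
                                           f_left, f_right, sigma)
    return "right" if value > 0 else "left"


for f_left, f_right in [(1.0, -1.0), (-1.0, 1.0), (3.0, 0.5)]:
    for side in ("left", "right"):
        print(f_left, f_right, side, occluding_side(f_left, f_right, side))
```

In every combination the test returns the side that was used to move the edge: a positive second derivative at the half-way point indicates the right surface, a negative one the left.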

In practice, difficulties may arise for very large differential
flows between the two surfaces. The second derivative function
$h''(x)$ approaches zero away from the zero crossing. Noise
sensitivity of the classification technique is likely to increase when
the value is small. It is useful to determine a guideline for the
size of the Gaussian blurring kernel to ensure that the curve
will be observed near its extrema, where the sign is more likely
to be correct. The form of the function $h''(x)$ may be simplified
by substitution for analysis purposes. Let

$$b = \frac{x}{\sigma\sqrt{2}}.$$

Then, in case 1,

$$h''(x) = f(b) = \frac{2}{\sigma^2\sqrt{\pi}}\, b\, e^{-b^2}.$$

The extrema of $f(b)$ will occur at $b = \pm 1/\sqrt{2}$, and the extrema
of $h''(x)$ occur at $x = \pm\sigma$. The ratio

$$\frac{|h''(2.7\sigma)|}{|h''(\sigma)|} = 2.7\, e^{-(2.7^2 - 1)/2} \approx 0.12$$

indicates that at $\pm 2.7\sigma$ the magnitude of $h''(x)$ is 12 percent of
its magnitude at the extrema, and thus is relatively close to
zero. If the noise is such that the sign will be accurate when
the expected Laplacian value is at least 10 percent of the
extrema value, then a Gaussian blurring kernel should be used of
standard deviation at least 1/2.7 of the maximum expected
magnitude of flow of the edge. For cases where the noise
presents more of a problem, a Gaussian of larger standard
deviation should be used. The analysis for case 2 can be performed
similarly with the same result.
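The kernel-size guideline reduces to two small formulas, sketched here. The ratio follows from the form of $h''(x)$ for a Gaussian-blurred step and does not depend on the standard deviation. The helper names and the default factor 2.7 simply mirror the text; they are not an established API.

```python
import math


def second_derivative_ratio(k):
    """|h''(k * sigma)| / |h''(sigma)| for a Gaussian-blurred step edge.

    h''(x) is proportional to x * exp(-x^2 / (2 sigma^2)), so the ratio
    depends only on k.
    """
    return k * math.exp(-(k * k - 1.0) / 2.0)


def minimum_sigma(max_edge_flow, k=2.7):
    """Smallest standard deviation for which a displacement of
    max_edge_flow stays within k standard deviations of the zero crossing."""
    return max_edge_flow / k


print(second_derivative_ratio(2.7))   # roughly 0.12, the 12 percent in the text
print(minimum_sigma(5.4))             # a flow of 5.4 pixels needs sigma of about 2
```

Raising the noise margin amounts to choosing a smaller `k`, which in turn calls for a larger kernel for the same expected flow.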

The algorithm is implemented as follows. Optical flow fields
are obtained for two temporally adjacent image pairs.
Approximations to the Laplacians of Gaussian blurred versions of these
flow fields are calculated by computing the difference of the
flow fields convolved with two different Gaussian kernels.
(Again, the componentwise Laplacian is used.) As before, edge
points are found in the first flow field by searching for vector
reversals in the Laplacian of the field. At such points, the value
of the smoothed flow field obtained from the larger of the
Gaussian kernels is considered to approximate the average flow
of the two surface regions on either side of the edge. This
average flow is used to find the appropriate offset to add to
the edge location to find P, a point midway between the two
possible edge locations in the second Laplacian field. Next,
the direction perpendicular to the edge point is estimated by
finding the direction of greatest change in the Laplacian of the
first flow field. The Laplacian of the second flow field at the
point P is then examined. The Laplacian component in the
second field perpendicular to the edge orientation points
toward the occluding surface.
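The steps just listed can be sketched in one dimension, with the x axis taken normal to the edge and a single flow component. This is a simplified illustration under several assumptions that are not the paper's: a unit time step between the two flow fields, kernel sizes of 2 and 4 samples, replicated borders, and the difference "larger kernel minus smaller kernel" as the Laplacian approximation (the order that gives the same sign as the true second derivative). All names are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    radius = int(math.ceil(4 * sigma))
    w = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]


def blur(signal, sigma):
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, wk in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # replicate border samples
            acc += wk * signal[j]
        out.append(acc)
    return out


def laplacian_dog(flow, s_small, s_large):
    """Difference-of-Gaussians Laplacian estimate plus the smoother field."""
    small, large = blur(flow, s_small), blur(flow, s_large)
    return [l - s for s, l in zip(small, large)], large


def classify(flow1, flow2, s_small=2.0, s_large=4.0):
    """Return 'left' or 'right': the side judged to be the occluding surface."""
    lap1, smooth1 = laplacian_dog(flow1, s_small, s_large)
    lap2, _ = laplacian_dog(flow2, s_small, s_large)
    # Edge in the first field: the steepest sign reversal of the Laplacian.
    crossings = [i for i in range(len(lap1) - 1) if lap1[i] * lap1[i + 1] < 0]
    best = max(crossings, key=lambda i: abs(lap1[i + 1] - lap1[i]))
    edge = best + 0.5
    avg_flow = 0.5 * (smooth1[best] + smooth1[best + 1])
    p = int(math.floor(edge + avg_flow + 0.5))  # nearest sample to point P
    return "right" if lap2[p] > 0 else "left"


def step(n, edge, f_left, f_right):
    return [f_left if i < edge else f_right for i in range(n)]


flow1 = step(200, 100, 2.0, -1.0)
print(classify(flow1, step(200, 102, 2.0, -1.0)))  # edge moved with the left surface
print(classify(flow1, step(200, 99, 2.0, -1.0)))   # edge moved with the right surface
```

A full implementation would work on two-dimensional vector fields and estimate the edge normal from the first Laplacian field, as described above. The sketch keeps only the sign logic.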

An example of this technique applied to an image sequence
is shown in Fig. 6. The leopard translates from left to right
approximately equally between frames 1, 2, and 3 in Fig. 6(a).
The edge points shown in Fig. 6(b) are obtained as described
in Section II. At each edge point, an offset based on the flow
vector from the smoother version of the field at that point is
added to the location of the edge point. The resulting location
is examined in the Laplacian of the second flow field. The
component of this Laplacian perpendicular to the edge will
point toward the occluding surface. Shown in Fig. 6(c) are the
edge points, each of which has an associated line segment. The
line segment projects in the direction of the occluding surface,
as determined by the algorithm. The correct classification is
made for all except a few points at the bottom of the edge. In
this region, several nearby tokens were matched in one frame
pair but not the other, significantly affecting the smoothed
flow fields in the neighborhood of the boundary.

IV.  Rotational  Motion 

Fig. 6. (a) Image sequence. (b) Detected boundary overlaid onto first
frame of sequence. (c) Identification of occluding surface. Each edge
point has a line segment projecting from it toward the occluding
surface.

Rotation in depth introduces several complexities for the
analysis of optical flow at occlusion boundaries. The first is an
unexpected corollary of (2): in certain situations, there is no
discontinuity in flow at occlusion boundaries. This occurs for
pure rotation in depth of objects that are circularly symmetric,
rotating about their axis of symmetry, and otherwise stationary
with respect to the background. In such cases, the image of
the boundary over time maintains a fixed position with respect
to the background. As a consequence of (2), the projected
surface flows on either side of the boundary are identical and are
the same as the boundary flow itself. Fortunately, the
zero-crossing-based boundary detection method is still usually
applicable, although the detected location of the boundary may be
displaced.

The second complication involves the determination of
occluding surfaces. Rotations in depth produce a dynamic
self-occlusion: the rotating object occludes sections of itself over
time. In the situation described in the previous paragraph,
self-occlusion is the only dynamic occlusion occurring. In these
circumstances, the relationship in (2) is of no direct value in
identifying the occluding surface. No information is available
on which side of the boundary corresponds to a true occluding
surface. (The situation is truly ambiguous in that two very
different classes of spatial organizations can produce the same flow
pattern.) If the rotating object is also translating relative to the
background, if the object is not rotationally symmetric, or if it
is not rotating around an axis of symmetry, then (2) will, in
principle, correctly identify the occluding surface. Difficulties
arise in practice, however, because the algorithm given above
depends on surface flow in the neighborhood of the boundary,
not just at the edge. In the presence of rotation in depth,
misclassifications are possible, particularly if no translation relative
to the background is occurring and/or the rotating object is
small, leading to rapidly changing flow values near the extremal
boundary.
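A spinning cylinder gives a simple picture of both complications. The sketch below assumes a cylinder of radius `radius` rotating at rate `omega` about its own vertical axis, viewed orthographically against a stationary background. It returns the magnitude of the horizontal flow only, since the sign depends on the direction of rotation. The function name and numbers are illustrative.

```python
import math


def cylinder_flow_magnitude(x, radius, omega):
    """Magnitude of horizontal image flow at image position x.

    On the visible face the depth of the surface relative to the axis is
    sqrt(radius^2 - x^2), and rotation about the axis moves the point
    sideways at omega times that depth. Outside the silhouette the
    background is taken to be stationary.
    """
    if abs(x) >= radius:
        return 0.0
    return omega * math.sqrt(radius ** 2 - x ** 2)


R, W = 10.0, 0.1
for x in (0.0, 9.0, 9.9, 9.99, 10.0, 10.5):
    print(x, cylinder_flow_magnitude(x, R, W))
```

The flow falls continuously to zero at the silhouette, so there is no step for the boundary detector to find there, yet it changes most rapidly just inside the boundary. For a small object that rapid change lies within the smoothing kernel, which is the source of the misclassifications noted above.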

Rotation also complicates inferences about relative depth
based on the analysis of occlusion boundaries. For translational
motion, the occluding surface on one side of a boundary is
necessarily in front of the occluded surface. For rotation in depth,
the occluded and occluding surfaces are on the same side of
the boundary, and no definitive information is available about
the surface on the other side of the boundary. (Reference
[10] shows an example in which a nonrotating surface on
one side of a boundary is in front of a rotating surface on the
other side of the boundary.) One approach to determining the
actual relative depth involves first determining whether or not
rotation in depth is actually occurring. Such an analysis is
beyond the scope of this paper (see [11]). As an alternative, an
analysis of surface regions that are appearing or disappearing
due to dynamic occlusion gives information about the occluded
surfaces at a boundary [10]. The method described here gives
information about the occluding surface. By combining the
two approaches, self-occlusion is recognized by noting a
boundary where one side is marked as both occluding and occluded.
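The combination rule in the last sentence is simple enough to state as code. This is only a sketch of the decision logic, assuming each method has already produced a label of 'left' or 'right' for the same boundary point. The function name and the returned strings are invented for illustration.

```python
def interpret_boundary(occluding_side, occluded_side):
    """Combine the two cues at one boundary point.

    occluding_side: side reported by the flow-based sign test.
    occluded_side:  side reported by accretion/deletion analysis.
    """
    if occluding_side == occluded_side:
        # One surface is both occluding and occluded: it hides part of itself.
        return "self-occlusion on the " + occluding_side + " side"
    return occluding_side + " surface occludes " + occluded_side + " surface"


print(interpret_boundary("left", "right"))
print(interpret_boundary("left", "left"))
```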

V.  Conclusion 

Motion-based boundary detection is sensitive only to depth
discontinuities and/or object boundaries. Thus, unlike
intensity-based edge detection, all detected edge points are of direct
significance to the interpretation of object shape. On the other
hand, significant edges will not be detected unless there is
perceived motion between the surfaces on either side.
Motion-based analysis offers another significant advantage. In most
cases, the side of a boundary corresponding to the occluding
surface can be identified. As we have shown, this is possible
for general motion, not just for a sensor moving through an
otherwise static environment. This determination is quite
difficult using only static information, and has received little
attention (e.g., [12]).

Appendix  A 

The  following  is  an  analysis  of  the  appropriateness  of  using 
zero  crossings  in  the  componentwise  Laplacian  of  a  flow  field 
to  detect  contours  of  maximal  rate  of  change  in  the  flow  field. 

Theorem: Let V be a twice continuously differentiable
vector field, let N be an open neighborhood containing the origin
such that $\partial V/\partial y$ is constant on N, let L be the intersection of
N and the y axis, and let u be a unit vector. Then $|\nabla V \cdot u|^2$
has an extremum in the x direction on L if and only if
$u_x (u \cdot \nabla V) \cdot \nabla^2 V$ has a zero crossing on L.

Justification: The magnitude of the directional derivative in
the u direction is

$$|\nabla V \cdot u|^2 = (\nabla V_x \cdot u)^2 + (\nabla V_y \cdot u)^2.$$

The partial derivative of this quantity can be simplified as
follows:

$$\frac{\partial}{\partial x} |\nabla V \cdot u|^2 = 2(\nabla V_x \cdot u)\left[u_x \frac{\partial^2 V_x}{\partial x^2} + u_y \frac{\partial^2 V_x}{\partial y\,\partial x}\right] + 2(\nabla V_y \cdot u)\left[u_x \frac{\partial^2 V_y}{\partial x^2} + u_y \frac{\partial^2 V_y}{\partial y\,\partial x}\right]$$

$$= 2 u_x (u \cdot \nabla V) \cdot \frac{\partial^2 V}{\partial x^2}$$

since $\partial V/\partial y$ is constant on N. For the same reason,
$\partial^2 V/\partial y^2 = 0$ and $\partial^2 V/\partial x^2 = \nabla^2 V$. Therefore,
$\partial/\partial x\, |\nabla V \cdot u|^2$ has a zero crossing whenever
$u_x (u \cdot \nabla V) \cdot \nabla^2 V$ does. But $|\nabla V \cdot u|^2$ has
an extremum in the x direction whenever $\partial/\partial x\, |\nabla V \cdot u|^2$ has
a zero crossing. ■

Whenever the Laplacian $\nabla^2 V$ has a zero crossing, so must
$u_x (u \cdot \nabla V) \cdot \nabla^2 V$, except when $u_x (u \cdot \nabla V) = 0$, which is
unlikely because real edges are places with steep gradients. Zero
crossings in the Laplacian will therefore almost always
correspond to extrema in the magnitude of the directional
derivative, with respect to almost all directions. It is possible for the
magnitude of the directional derivative to have an extremum
without a zero in the Laplacian because the component at right
angles to the preferred direction defined by $u \cdot \nabla V$ need not
be small. If there is no variation of the field parallel to the
edge, then the steepest directional derivative occurs in the
direction normal to the edge, and if the variation parallel to the edge
is much less than that normal to the edge, as we expect for
most images, then the steepest directional derivative occurs in
a direction nearly normal to the edge. If we choose u in the x
direction, then $u \cdot \nabla V$ will be parallel to $\partial V/\partial x$, so that the
above theorem states the component of the Laplacian in the
direction parallel to the difference in the flow on both sides
of the boundary will have a zero crossing. The Laplacian can
fail to have a direction reversal at an edge only if the
component of the Laplacian at right angles to the flow difference is
not small, which occurs when the normal component of the
flow gradient at an edge is changing in direction more rapidly
than it is changing in magnitude. Such situations do not appear
to be common in real optical flows, and can occur only when
the unfiltered flow is changing appreciably in a neighborhood
of the edge for at least one of the two surfaces. For the case
of a boundary between two surfaces with distinct uniform
flows on each surface, the smoothed Laplacian has a directional
zero crossing in all directions except along the boundary. In
that direction, the value of the smoothed Laplacian is zero.

The extremum can be either a maximum or a minimum. A
maximum is of course desired, and minima are discarded in
practice by requiring the slope of the zero crossing to be
sufficiently steep. While this is not a guaranteed test, it works in
almost all cases because of the Gaussian filtering applied to
the images before the Laplacian is calculated. Minima in the
gradient usually correspond to areas where the field is uniform,
and due to the tails on a Gaussian curve, gradients near the
minima tend to be small, with small values for derivatives of
all orders.
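The identity behind the theorem can be checked numerically on a smooth test field. The field below is a hypothetical example chosen to satisfy the hypothesis that $\partial V/\partial y$ is constant: each flow component is a tanh profile in x plus a linear term in y. The amplitudes, the direction u, and the function names are assumptions made for this sketch.

```python
import math

A = (1.0, 0.5)    # hypothetical amplitudes of the tanh profile across the edge
B = (0.3, -0.1)   # constant dV/dy, as the theorem requires


def dV_dx(x):
    s = 1.0 / math.cosh(x) ** 2
    return (A[0] * s, A[1] * s)


def d2V_dx2(x):
    s = -2.0 * math.tanh(x) / math.cosh(x) ** 2
    return (A[0] * s, A[1] * s)


def directional_derivative(x, u):
    """(u . grad) V: one entry per flow component."""
    d = dV_dx(x)
    return (u[0] * d[0] + u[1] * B[0], u[0] * d[1] + u[1] * B[1])


def magnitude_sq(x, u):
    g = directional_derivative(x, u)
    return g[0] ** 2 + g[1] ** 2


def theorem_quantity(x, u):
    """u_x (u . grad V) . Laplacian(V); here the Laplacian is d2V/dx2."""
    g = directional_derivative(x, u)
    lap = d2V_dx2(x)
    return u[0] * (g[0] * lap[0] + g[1] * lap[1])


def numeric_slope(x, u, h=1e-5):
    """Central-difference estimate of d/dx |grad V . u|^2."""
    return (magnitude_sq(x + h, u) - magnitude_sq(x - h, u)) / (2.0 * h)


u = (math.cos(0.4), math.sin(0.4))
for x in (-1.0, -0.3, 0.0, 0.2, 0.8):
    print(x, numeric_slope(x, u), 2.0 * theorem_quantity(x, u))
```

The two printed columns agree, and both change sign at x = 0, where the Laplacian of this field has its zero crossing and the squared directional derivative has its maximum.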

Appendix  B 

This Appendix contains the analysis showing that the limit
of surface flow is equal to boundary flow for the rotation of
smooth objects for orthographic projections. Any motion of a
rigid body can be described by giving the trajectory of an
arbitrary point attached to the body and the instantaneous rotation
about some axis passing through that point. Define a set of
Cartesian axes (X, Y, Z) with the origin at the distinguished
point on the body and with the Z axis along the axis of
rotation, and let $(r, \theta, \phi)$ be spherical coordinates with respect to
these axes. Let the orientations of the axes (X, Y, Z) be fixed
with respect to the axes (x, y, z) of the image plane coordinates,
so that the angular velocity of an arbitrary rotation is the same
in both coordinate systems. Let the surface of the body be
described by

$$r = R(\theta - \psi(t), \phi) \qquad (23)$$

where $\psi(0) = 0$, so that $r = R(\theta, \phi)$ at time $t = 0$. The
parameter $\alpha = \theta - \psi(t)$ is the longitudinal angle of a point fixed to
the surface at $t = 0$, and points with constant values of $\alpha$ and
$\phi$ rotate along with the surface. Since $\theta = \alpha + \psi(t)$, $\omega = d\psi/dt$
gives the angular velocity of the object about the Z axis.

At some particular instant of time, let G be a generating point
$(r_g, \theta_g, \phi_g)$ and n be the unit surface normal at G. Since G is a
generating point and orthographic projection is involved, n will
be parallel to the image plane. The normal component of the
flow for an arbitrary point $p = (r, \theta, \phi)$ fixed to the surface is
as follows:

$$f_x(p) = (\Omega \times p) \cdot n = \omega R(\theta - \psi, \phi) \sin\phi\, [-n_x \sin\theta + n_y \cos\theta] \qquad (24)$$

where $\Omega$ is the vector angular velocity of magnitude $\omega$ and
oriented along the Z axis. The orientation of $\Omega$ and n may be
changing, but the analysis below is based on the instantaneous
values of both quantities at some particular point in time.

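The x axis in the image plane is oriented parallel to the
constant unit vector n. Since we are working with an orthographic
projection, the x coordinate of the point p is as follows:

$$x = p \cdot n = R(\theta - \psi, \phi)\, [\hat{p}(\theta, \phi) \cdot n] \qquad (25)$$

$$[\hat{p}(\theta, \phi) \cdot n] = n_x \sin\phi \cos\theta + n_y \sin\phi \sin\theta + n_z \cos\phi \qquad (26)$$

where $\hat{p}$ is the unit vector parallel to p. Since the generating
point is on the extremal boundary of the object, x must have
an extremum at the generating point for variations in both $\theta$
and $\phi$. This leads to

$$\frac{\partial x}{\partial \theta} = \frac{\partial R(\theta - \psi, \phi)}{\partial \theta} [\hat{p}(\theta, \phi) \cdot n] + R(\theta - \psi, \phi) \frac{\partial}{\partial \theta} [\hat{p}(\theta, \phi) \cdot n] = 0 \qquad (27)$$

$$\frac{\partial x}{\partial \phi} = \frac{\partial R(\theta - \psi, \phi)}{\partial \phi} [\hat{p}(\theta, \phi) \cdot n] + R(\theta - \psi, \phi) \frac{\partial}{\partial \phi} [\hat{p}(\theta, \phi) \cdot n] = 0 \qquad (28)$$

for $\theta = \theta_g$, $\phi = \phi_g$. Let $x_g$ denote the x coordinate of the
generating point. From (25), the flow of the boundary is as
follows:

$$f_b = \frac{dx_g}{dt} = \frac{\partial x}{\partial t} + \frac{\partial x}{\partial \theta} \frac{d\theta_g}{dt} + \frac{\partial x}{\partial \phi} \frac{d\phi_g}{dt} \qquad (29)$$

evaluated at $\theta = \theta_g$, $\phi = \phi_g$. From (27), (28), and (26) we get

$$f_b = -\frac{d\psi}{dt} \frac{\partial R(\theta - \psi, \phi)}{\partial \theta} [\hat{p}(\theta, \phi) \cdot n] \qquad (30)$$

$$= \omega R \frac{\partial}{\partial \theta} [\hat{p}(\theta, \phi) \cdot n] \qquad (31)$$

$$= \omega R(\theta_g - \psi, \phi_g) \sin\phi_g\, [-n_x \sin\theta_g + n_y \cos\theta_g] \qquad (32)$$

$$= f_x(0, 0) \qquad (33)$$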

using (24) and $d\psi/dt = \omega$. This establishes (2) for arbitrary
orientations of the axis of rotation with respect to the image
plane, assuming an orthographic projection.

Analysis of Accretion and Deletion at Boundaries in Dynamic Scenes

Abstract: In dynamic scenes, the presence of object boundaries is
often signaled by the appearance or disappearance of occluded surfaces
over time. Such regions of surface accretion or deletion can be found
using matching techniques similar to those used to determine optical
flow in an image sequence. Regions in one frame that are not
adequately matched by any region in previous frames correspond to
accretion. Regions that have no matches in subsequent frames correspond to
deletion. In either case, an occlusion boundary is present.
Furthermore, by associating accretion or deletion regions with a surface on one
side of a boundary, it is possible to determine which side of the
boundary is being occluded. This association can be based purely on visual
motion: the accretion or deletion region moves with the same image
velocity as the remaining visible surface to which it is attached.

Index Terms: Dynamic scene analysis, edge detection, occlusion,
optical flow, segmentation.


I. Introduction

LOCATING object boundaries in images is an important
but difficult problem. Intensity-based edge detection
provides ambiguous or misleading boundary information in
many situations, such as textured regions. Motion-based
techniques can provide more reliable results in these cases. At
object boundaries where occlusion occurs, surface regions will
typically appear or disappear over time when motion is
present. These regions of changing visibility may be used to
indicate both object boundaries and the side of the boundary
corresponding to the occluded surface.

At a typical object boundary, one surface will be blocking
the view of another more distant surface. In the presence of
motion, regions of the more distant surface will often either
appear or disappear from view over time. Such regions are
called areas of accretion or deletion, respectively. A similar
situation arises in stereo vision, where a region of the more
distant surface near an occlusion edge will be visible in one
image of the pair but invisible in the other image. Thus,
recognition of accretion/deletion regions is a means of locating
object boundaries in image sequences. In addition, accretion and
deletion regions will belong to the occluded surface, providing
sufficient information to determine which of the two surfaces
at a boundary is being occluded. To recover the information
available from such regions, it is necessary to determine both
how regions of accretion and deletion in the imagery may be
identified, and what characteristics of such regions permit
identification of the occluded surface.

This  paper  describes  a  scheme  to  locate  regions  of  accretion 
and  deletion,  and  to  identify  occluding  surfaces  at  a  boundary 
using  these  regions  A  technique  which  matches  image  fea¬ 
tures  in  two  frames  is  used  to  determine  feature  displacement 
on  the  image  plane.  Areas  in  the  image  with  a  high  percentage 
of  features  which  are  unmatchable  in  a  previous  or  subsequent 
image  are  identified  as  accretion  or  deletion  regions,  respec¬ 
tively.  These  regions  indicate  the  presence  of  an  occlusion 
boundary. Since the accretion/deletion region belongs to the
occluded  surface,  it  will  be  displaced  on  the  image  plane  in  the 
same  fashion  as  nearby  areas  of  that  surface.  The  occluded 
surface  is  then  identified  by  determining  which  of  the  two  sur¬ 
faces  adjacent  to  the  accretion/deletion  region  displays  a  simi¬ 
lar  displacement  on  the  image  plane.  This  identification  com¬ 
bines  information  about  accretion  and  deletion  with  optical 
flow  to  produce  a  description  of  the  occlusion  boundary  more 
complete  than  any  existing  technique  based  purely  on  flow 
alone. 

II. Previous Work

Several  research  efforts  in  computational  vision  have  utilized 
motion  information  to  recover  object  boundaries.  The  basic 
idea  behind  most  motion-based  approaches  is  that  image  plane 
motion,  or  optical  flow,  across  the  object  surface  will  be  con¬ 
stant  or  slowly  varying,  and  discontinuities  in  flow  will  occur 
only at object edges. Previous approaches either search for
discontinuities  in  the  optical  flow,  or  group  together  regions  of 
similar  flow.  Nakayama  and  Loomis  [1]  propose  a  local, 
center-surround  operator  for  detecting  object  boundaries  in 
flow fields. Clocksin [2] shows that zero-crossings will occur
at edge locations in the Laplacian of the magnitude of the
optical flow field when an observer translates through an
otherwise static environment. Thompson et al. [3], [4] demonstrate
that the Laplacian is useful as an edge detector in the
more general case of unconstrained motion. After obtaining
point velocities by template matching, Potter [5] groups all
points with the same velocity into single object regions. Similarly,
Fennema and Thompson [6] use the spatial and temporal
gradients of intensity to obtain point velocities, and then
consider all points with similar velocities to be part of the
same object. Thompson [7] develops a grouping scheme
based upon both intensity and velocity information. Regions
of both identical intensity and identical velocity are formed,
followed by merging of adjacent regions based upon similarities,
or at least lack of conflict, in intensity and velocity. With
the exception of Clocksin's work [2], these flow-based techniques
are unable to provide any indication of the occluded
surface at an edge.

Accretion  and  deletion  are  fundamental  to  motion  analysis 
based on differencing [8], [9]. These techniques subtract one
image  from  another  and  then  use  the  presence  of  regions  of 
significant  difference  to  infer  properties  of  object  boundaries 
and  motion.  The  approach  is  most  effective  when  a  reason¬ 
ably  homogeneous  object  is  moving  relative  to  a  homogeneous 
background with different luminance. Covering and uncovering
of the background leads to significant differences between
frames,  allowing  boundaries  to  be  located.  Analysis  of  these 
difference  regions  over  time  can  often  be  used  to  associate  the 
difference  region  with  adjacent,  nonchanging  areas  of  the 
image  sequence  and  thereby  identify  which  side  of  the  bound¬ 
ary  is  being  occluded.  This  scheme  is  intensity  based,  and 
suffers  when  intensity  contrasts  occur  that  are  not  related  to 
object  structure.  A  textured  object  which  changes  location  on 
the image plane, for example, will produce many regions of intensity
difference which are not accretion or deletion regions.

Only  limited  experimentation  has  been  directed  at  the  role 
of accretion and deletion in human perception. Kaplan [10]
showed  that  patterns  of  accretion  and  deletion  in  fields  of 
moving  random  dots  provide  sufficient  information  for  the 
judgment  of  relative  depth  by  human  subjects.  In  his  stimuli, 
a  single  edge  separated  two  regions  of  random  dots,  where 
each  region  moved  coherently.  The  edge  was  implicit,  being 
the  line  along  which  accretion  and/or  deletion  occurred,  and 
thus  was  not  visible  if  all  of  the  dots  were  stationary.  Subjects 
consistently  perceived  the  more  distant  surface  to  be  the  one 
which  was  undergoing  accretion  or  deletion  at  a  greater  rate. 
This  was  true  even  when  the  implicit  edge  moved  with  a  ve¬ 
locity  different  than  the  velocity  of  points  on  either  surface. 
In  these  cases  of  inconsistent  edge  motion,  there  was  more 
ambiguity  in  the  perceptions  of  subjects,  although  the  statis¬ 
tically  significant  perception  was  that  the  surface  with  a 
greater  rate  of  accretion  or  deletion  was  more  distant.  This 
suggests that both edge velocity and accretion/deletion are important
factors in the perception of depth at an edge, but that
accretion/deletion information may be dominant.

III.  DETECTING  ACCRETION/DELETION  REGIONS 

A  motion-based  scheme  for  identifying  accretion  and  dele¬ 
tion  regions  is  developed  here.  To  recover  motion  on  the 
image  plane,  corresponding  structures  in  each  frame  of  an 
image pair are located. The result of this is a disparity vector
field, where each vector represents the change in image plane
location of a structure. (Disparity is the discrete representation
of optical flow arising from image sequences that are discretely
sampled in time.) This correspondence is accomplished
by token matching. A token is a distinctive region in the
image which is identified by some predefined local operator.
A  set  of  tokens  is  obtained  for  each  image  in  the  pair,  and  an 


Fig. 1. Location of an accretion/deletion region relative to the boundary
indicates the direction of the occluded surface. In both cases
shown above, the vertical line represents a boundary and the shaded
area represents an accretion or deletion region. The arrow points
toward the occluded surface.

organized  search  is  performed  to  match  tokens  from  the  first 
image  to  corresponding  tokens  in  the  second  image  using  the 
relaxation labeling technique described in [11]. Possible
matches  between  tokens  in  the  two  frames  are  evaluated  based 
on  two  criteria:  the  similarity  between  properties  of  the 
tokens,  and  a  surface  smoothness  measure  that  favors  matches 
with  disparities  similar  to  neighboring  tokens.  An  important 
aspect  of  this  particular  matching  technique  is  that  it  can 
determine  that  a  token  in  one  frame  is  unmatchable  if  no 
token  in  the  other  frame  satisfies  the  appropriate  matching 
criteria.  By  basing  the  analysis  on  the  motion  of  tokens  in  the 
image,  many  of  the  intensity  contrast  problems  of  a  differ¬ 
encing  system  are  circumvented. 
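
The role of unmatchable tokens can be illustrated with a minimal Python sketch. It is not the relaxation labeling technique of [11]: it substitutes a simple greedy nearest-neighbor matcher and omits the surface smoothness measure entirely. The `Token` type, its scalar `prop` field, and the thresholds `max_disparity` and `max_property_diff` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    x: float
    y: float
    prop: float  # hypothetical scalar summary of the token's local appearance


def match_tokens(first, second, max_disparity=5.0, max_property_diff=0.2):
    """Greedy one-to-one matching of tokens in `first` to tokens in `second`.

    Returns a dict mapping each index in `first` to a disparity (dx, dy),
    or to None when no token in `second` satisfies the matching criteria.
    This is a simplified stand-in, not the relaxation labeling of [11].
    """
    candidates = []
    for i, a in enumerate(first):
        for j, b in enumerate(second):
            dist = ((b.x - a.x) ** 2 + (b.y - a.y) ** 2) ** 0.5
            if dist <= max_disparity and abs(a.prop - b.prop) <= max_property_diff:
                candidates.append((dist, i, j))

    # Accept the closest admissible pairs first; each token is used at most once.
    candidates.sort()
    result = {i: None for i in range(len(first))}
    used_second = set()
    for _, i, j in candidates:
        if result[i] is None and j not in used_second:
            result[i] = (second[j].x - first[i].x, second[j].y - first[i].y)
            used_second.add(j)
    return result
```

Tokens mapped to `None` are the unmatchable tokens on which the rest of the analysis depends; a matcher that always forced some match would hide them.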

Regions of accretion and deletion are identified by analyzing
unmatchable tokens in either image. A token may not be
matchable either because the token detector failed to find the
corresponding structure in the other image of the pair, or
because the corresponding token is not visible in the other
image. Regions with a high ratio of unmatchable tokens to
total tokens are likely to be regions of accretion or deletion.
This motion-based, token-matching approach is an implementation
of Kaplan's model for detecting such regions [12].
Kaplan  argues  that  accretion  and  deletion  are  detected  in  the 
human  visual  system  by  isolating  clusters  of  elements  of  opti¬ 
cal  texture,  tracking  them  over  time,  and  responding  when 
they  change  in  some  way  that  is  not  topologically  permissible. 
Token  identification  is  equivalent  to  isolating  elements  of 
optical  texture;  token  matching  serves  the  purpose  of  tracking 
such  elements  over  time;  and  analyzing  unmatchable  tokens  is 
a  response  to  some  change  which  may  be  due  to  appearance  or 
disappearance  of  a  region. 
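
The ratio test described above can be sketched in a few lines of Python. The tuple layout `(x, y, matched)` and the function names are assumptions made for illustration; the default window half-width (giving an 11 x 11 neighborhood) and the 80 percent threshold mirror the values reported in the example section of this paper and would need tuning for other imagery.

```python
def unmatched_ratio(tokens, center, half_width=5):
    """Fraction of tokens in a square window around `center` that are unmatched.

    `tokens` is a list of (x, y, matched) tuples; the window has side
    2 * half_width + 1 pixels. Returns 0.0 if the window holds no tokens.
    """
    cx, cy = center
    total = 0
    unmatched = 0
    for x, y, matched in tokens:
        if abs(x - cx) <= half_width and abs(y - cy) <= half_width:
            total += 1
            if not matched:
                unmatched += 1
    return unmatched / total if total else 0.0


def is_accretion_deletion_candidate(tokens, center, half_width=5, min_ratio=0.8):
    """True when the local density of unmatched tokens reaches `min_ratio`."""
    return unmatched_ratio(tokens, center, half_width) >= min_ratio
```

Requiring a high local ratio, rather than reacting to any single unmatched token, is what separates genuine appearance or disappearance from isolated failures of the token detector.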

IV.  Identifying  Occluded  Surfaces 

Not only can accretion/deletion patterns be used to locate
boundaries, they provide information that allows the identification
of the side of the boundary being occluded. Such
information is beneficial when interpreting dynamic scenes.
Several specific approaches are possible, though all are based
on associating the accretion or deletion region with a surface
on one side of the boundary. That surface is the one being occluded.
One approach relies upon the relative location of the
accretion/deletion region with respect to the precise position
of the image of the boundary. This boundary is the actual
point of occlusion, the accretion/deletion region being on the
same side of the boundary as the occluded surface. Fig. 1
illustrates this concept. The primary difficulty in this approach
is identifying the boundary location relative to the
accretion/deletion region. In particular, motion-based edge detection
cannot locate the boundary precisely enough without
first knowing which surface is occluded. The inadequacies of
intensity-based edge detectors for this purpose are well known,
particularly when applied to textured surfaces.

Fig. 2. The location of new accretion/deletion regions relative to previous
such regions indicates the direction of the occluded surface.
The second accretion region appears on the opposite side of the first
accretion region from the occluded surface. The second deletion region
occurs on the same side of the first deletion region as the occluded
surface.

An  alternative  approach  involves  identifying  the  location  of 
an  accretion  or  deletion  region  relative  to  the  location  of  such 
a region at a previous instant in time [13]. New accretion
regions  will  appear  to  the  side  of  previous  accretion  regions 
opposite  the  remainder  of  the  occluded  surface.  New  deletion 
regions  will  occur  on  the  same  side  of  previous  deletion  regions 
as  the  occluded  surface  (see  Fig.  2).  A  disadvantage  of  this 
approach  is  the  necessity  to  track  and  locate  whole  accretion/ 
deletion  regions  over  time. 

The  approach  for  identifying  occluded  surfaces  from  ac¬ 
cretion/deletion  regions  which  is  developed  in  this  paper  re¬ 
quires  the  recognition  of  similarities  between  such  regions  and 
one  of  the  two  surfaces  on  either  side  of  the  boundary.  Since 
the  accretion/deletion  region  belongs  to  the  occluded  surface, 
it  will  share  certain  properties  with  that  surface.  The  common 
property  could  be  intensity  or  texture,  although  the  problems 
inherent  in  most  intensity-based  analyses  make  these  alterna¬ 
tives  undesirable.  Once  again,  motion-based  properties  may  be 
more  reliable.  One  such  property  is  the  disparity  of  tokens  on 
the  image  plane.  Disparity  varies  slowly  over  the  surface  of 
almost  all  rigid  objects.  Accretion  or  deletion  tokens  will  thus 
exhibit  disparities  which  are  nearly  identical  to  nearby  token 
disparities  on  the  same  surface,  while  token  disparities  on  dif¬ 
ferent  surfaces  will  usually  vary. 
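
As a rough illustration of this disparity-similarity argument, the following Python sketch compares the mean disparity of an accretion/deletion region with the mean disparities of the two adjacent surfaces and picks the closer one. Comparing means is a simplifying assumption made here; the function names and the "A"/"B" labels are hypothetical.

```python
def mean_disparity(disparities):
    """Component-wise mean of a non-empty list of (dx, dy) disparities."""
    n = len(disparities)
    return (sum(d[0] for d in disparities) / n,
            sum(d[1] for d in disparities) / n)


def occluded_side(region_disparities, side_a_disparities, side_b_disparities):
    """Return "A" or "B": the surface whose mean disparity is closest to
    that of the accretion/deletion region, i.e. the occluded surface."""
    rx, ry = mean_disparity(region_disparities)

    def distance(side):
        sx, sy = mean_disparity(side)
        return ((sx - rx) ** 2 + (sy - ry) ** 2) ** 0.5

    if distance(side_a_disparities) <= distance(side_b_disparities):
        return "A"
    return "B"
```

For example, a nearly stationary accretion region next to a surface moving about 3 pixels per frame (side A) and a nearly stationary surface (side B) is assigned to side B.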

V.  Implementation 

The  system  which  was  developed  to  detect  occluded  surfaces 
from  regions  of  accretion  or  deletion  uses  token  matching  to 
obtain  disparity  vector  fields.  Unmatched  tokens  in  clusters  of 
high  density  are  classified  as  accretion  or  deletion  tokens,  de¬ 
pending  upon  whether  they  have  matches  in  subsequent  or 
previous  frame  pairs.  The  disparity  of  accretion  tokens  after 
their appearance, or of deletion tokens prior to their disappearance,
is obtained. Nearby tokens which are not accretion
or deletion tokens and which have known disparities are identified
and are used to identify the surface to which the accretion
or deletion tokens belong. Such tokens with similar disparities
to an accretion or deletion point lie on the occluded
surface.

Fig. 3. The optical flow of surfaces A and B is indicated by the vectors
on those surfaces. Since neither surface exhibits any flow perpendicular
to the edge, there will be no accretion or deletion regions.

Three  frames  in  an  image  sequence  are  required.  Disparity 
fields D1 and D2 are obtained for frames 1 and 2, and for
frames 2 and 3, respectively. Accretion points are not visible
in  frame  1,  but  do  appear  in  frames  2  and  3.  Tokens  first  ap¬ 
pearing  in  frame  2,  and  thus  having  no  associated  match  in 
frame  1,  are  noted.  If  these  tokens  have  a  match  in  frame  3, 
and  if  they  are  in  a  region  with  a  high  ratio  of  such  tokens  to 
total  tokens,  they  are  considered  to  be  points  of  accretion. 
The  disparity  of  accretion  points  is  provided  by  D2.  For  every 
accretion  point,  a  search  is  made  within  a  neighborhood  about 
the  point  location  in  frame  2.  Tokens  which  are  matched  in 
D2,  but  which  are  not  marked  as  accretion  points  are  found. 
All  of  these  tokens  which  have  disparities  similar  to  the  accre¬ 
tion  point  are  considered  as  a  cluster.  A  vector  pointing 
towards  the  center  of  the  cluster  is  assigned  to  each  accretion 
point,  and  indicates  the  direction  from  that  point  to  the  oc¬ 
cluded  surface. 
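
The neighborhood search and direction assignment for a single accretion point might be sketched as follows. This is a minimal Python illustration under stated assumptions: the data layout, the function name, and the use of the cluster centroid as its "center" are choices made here, and the default window half-width (a 31 x 31 window) and the 2 pixel tolerance are taken from the example section of this paper.

```python
import math


def direction_to_occluded_surface(point, point_disparity, neighbors,
                                  half_width=15, tolerance=2.0):
    """Unit vector from an accretion point toward the occluded surface.

    `neighbors` is a list of ((x, y), (dx, dy)) pairs for tokens that are
    matched in D2 and are not themselves accretion points. Tokens inside
    the square window whose disparity differs from `point_disparity` by at
    most `tolerance` in each component form the cluster. Returns None when
    the cluster is empty or centered exactly on the point.
    """
    px, py = point
    pdx, pdy = point_disparity
    cluster = []
    for (x, y), (dx, dy) in neighbors:
        in_window = abs(x - px) <= half_width and abs(y - py) <= half_width
        similar = abs(dx - pdx) <= tolerance and abs(dy - pdy) <= tolerance
        if in_window and similar:
            cluster.append((x, y))
    if not cluster:
        return None

    # Centroid of the similar-disparity cluster (an assumed notion of "center").
    cx = sum(x for x, _ in cluster) / len(cluster)
    cy = sum(y for _, y in cluster) / len(cluster)
    vx, vy = cx - px, cy - py
    norm = math.hypot(vx, vy)
    if norm == 0:
        return None
    return (vx / norm, vy / norm)
```

The same routine applies to deletion points if the neighbors and disparities are taken from D1 and positions from frame 1.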

Deletion  points  are  visible  in  frames  1  and  2,  but  not  frame 
3. Tokens which are indicated as unmatchable in frame 2 are
noted.  If  these  tokens  have  a  match  in  frame  1  and  if  they  are 
in  a  region  with  a  high  ratio  of  such  tokens  to  total  tokens, 
they  are  considered  to  be  points  of  deletion.  The  disparity  of 
deletion points is provided by D1. For every deletion point,
a search is made within a neighborhood about that point location
in frame 1. Tokens which are matched in D1, but which
are not marked as deletion points, are found. All of these
tokens which have disparities similar to the deletion point are
considered  as  a  cluster.  As  before,  a  vector  in  the  direction 
of  the  center  of  the  cluster  is  assigned  to  each  deletion  point 
and  indicates  the  direction  from  that  point  to  the  occluded 
surface. 
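
The two labeling rules are symmetric, which a small Python sketch makes explicit. The function name, the boolean arguments, and the string labels are hypothetical; `in_dense_region` stands for the outcome of the local ratio test on similarly unmatched tokens.

```python
def classify_frame2_token(matched_in_d1, matched_in_d2, in_dense_region):
    """Label a frame 2 token from its match status in the two disparity fields.

    matched_in_d1: the token has a match in frame 1 (field D1).
    matched_in_d2: the token has a match in frame 3 (field D2).
    in_dense_region: the token lies in a region with a high ratio of
        tokens unmatched in the same way.
    """
    if in_dense_region:
        if not matched_in_d1 and matched_in_d2:
            return "accretion"  # newly visible in frame 2, persists in frame 3
        if matched_in_d1 and not matched_in_d2:
            return "deletion"   # visible in frames 1 and 2, gone in frame 3
    return "other"
```

Processing the same three frames in reverse order swaps the roles of D1 and D2, so every accretion label becomes a deletion label and vice versa.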

VI.  Limitations 

This  boundary  detection  technique  requires  a  moderately 
dense  token  set,  both  to  find  accretion/deletion  regions,  and 
to  determine  image-plane  displacements.  This  means  that  the 
two  surfaces  adjacent  to  the  edge  must  be  distinctly  textured. 
In  addition,  there  must  be  some  component  of  optical  flow 
perpendicular  to  the  occlusion  boundary,  or  neither  accretion 
nor  deletion  will  occur.  In  particular,  motion  exactly  parallel 
to  the  boundary  will  produce  no  accretion  or  deletion  regions 
(see Fig. 3). Perspective viewing of translating objects in principle
leads to difficulties similar to those associated with rotation
in depth (see below), as the perspective effects can be
locally described as a combination of rotation and scale
change. Fortunately, the practical difficulties caused by perspective
effects are minimal. When objects are translating in
front of a background, the size of accretion/deletion regions
due to translation is almost always much greater than that of
accretion/deletion regions that appear due to effective rotation of
the object.

Fig. 4. (a) Overhead view of a cylinder rotating counter-clockwise
about an axis at C, in front of a stationary background B. The
viewer is at A, and the line of sight is along the dotted line. (b) The
rotating cylinder seen through an aperture in surface B, which is now
in front. (c) When either (a) or (b) is viewed from point A, accretion
regions a will occur along the left edge on the cylinder, deletion
regions d along the right edge. While the cylinder is correctly
identified as the occluded surface, there is insufficient information to
determine the relative depth between the cylinder and the surface
at B.
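
The requirement that some optical flow be perpendicular to the occlusion boundary can be checked numerically. The Python sketch below assumes that what matters is the flow of one surface relative to the other, projected onto the boundary normal; that reading, the function names, and the `eps` threshold are assumptions made for illustration.

```python
import math


def normal_flow_component(flow_a, flow_b, boundary_direction):
    """Relative flow of surfaces A and B projected onto the boundary normal.

    `boundary_direction` is a nonzero (tx, ty) vector along the boundary.
    """
    tx, ty = boundary_direction
    length = math.hypot(tx, ty)
    if length == 0:
        raise ValueError("boundary_direction must be nonzero")
    nx, ny = -ty / length, tx / length  # unit normal to the boundary
    rx, ry = flow_a[0] - flow_b[0], flow_a[1] - flow_b[1]
    return rx * nx + ry * ny


def can_produce_accretion_or_deletion(flow_a, flow_b, boundary_direction,
                                      eps=1e-9):
    """False when the relative motion is exactly parallel to the boundary."""
    return abs(normal_flow_component(flow_a, flow_b, boundary_direction)) > eps
```

With a vertical boundary, two surfaces sliding in opposite vertical directions yield a zero normal component, matching the situation of Fig. 3, whereas any horizontal relative motion yields a nonzero one.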

Certain  rotations  lead  to  potentially  confusing  situations 
when analyzing occlusion boundaries. Fig. 4(a) shows an
overhead  view  of  a  cylinder  rotating  in  depth.  Fig.  4(c)  shows 
the  accretion/deletion  regions  that  arise  if  there  is  no  relative 
motion  between  the  cylinder  and  the  background  surface. 
The  analysis  above  assigns  the  accretion  and  deletion  regions 
to the cylinder. Thus, the cylinder, not the background surface,
is indicated as the surface being dynamically occluded.
This is the correct interpretation, as the rotation in depth
causes the cylinder to occlude itself over time. However, while
the dynamic occlusion is correctly recognized, no information
is directly available about the relative depths to the surfaces on
either side of the boundary. In fact, it is possible that the surrounding
surface in the image is actually in front of the cylinder
(Fig. 4(b)), yet generates the same image sequence.

Fig. 5. Image sequence in which the leopard is translating from left to
right.

A  different  complication  occurs  if  the  rotating  object  is 
moving  with  respect  to  the  background  surface,  the  cross 
section  of  the  object  is  not  circular,  or  the  object  is  not  rotat¬ 
ing  about  its  axis  of  symmetry.  In  all  of  these  situations,  ac¬ 
cretion  and/or  deletion  will  be  occurring  on  both  sides  of  the 
actual  boundary.  The  method  given  above  is  still  valid  and  will 
identify  both  sides  of  the  boundary  as  occluded  surfaces.  The 
problem again arises when trying to infer relative depth given
an identification of the occluded surface. The determination
of relative depth at a dynamic occlusion boundary when rotation
is occurring is made possible by combining accretion/deletion
analysis as described in this paper with an optical flow
based approach [4]. This second technique uses the relationship
between the flow of a boundary and the surface flows on
either side of the boundary to identify occluding surfaces.
Accretion/deletion analysis locates occluded surfaces. When
taken together, both the occlusion of one surface by another
and the self-occlusion resulting from rotation in depth can be
recognized and appropriately interpreted.

VII. Example

The  system  implementation  described  above  was  applied 
twice to the image sequence shown in Fig. 5, first processing
the sequence in the order shown, then in the reverse order. All
images had a resolution of 128 × 128. There were approximately
1000 tokens identified in each image, and over 800 of
these were matched in every image pair. As is usual with
token matching systems, the density of tokens (and thus disparity
vectors) varied across the image, being higher in areas of
fine texture. An 11 × 11 square neighborhood, centered at
the unmatched point, was used for computing the density of
unmatched tokens. This size was chosen to be small enough so
that most of the neighborhood fell within the accretion/deletion
region, yet big enough to contain a reasonable number of
tokens (usually 6 to 12). If 80 percent of the tokens in this
neighborhood were unmatched in the same way as the point
under consideration, then the point was labeled "accretion" or
"deletion." This ratio was chosen to be selectively high, and
yet to allow for some incorrect matches in the neighborhood,
or some extension of the neighborhood out of the accretion/deletion
region. A 31 × 31 window, centered at the accretion/deletion
point, was searched to find clusters of similar
disparity vectors. This size was chosen to be large enough to
include portions of both surfaces outside the accretion/deletion
region, yet not so large as to extend beyond these surfaces.
Disparity vectors were considered "similar" if they
differed by no more than 2 pixels in each of the x and y components.
The actual values of most of these parameters will,
in general, depend upon factors such as the resolution of the
images, the amount of texture, and the maximum expected
disparity.

Fig. 6. (a) Results of occluded surface determination based upon accretion/deletion
regions for the sequence of three frames in Fig. 5.
Square white boxes are locations of accretion or deletion points in
frame 2. The line emanating from each box points in the direction of
the occluded surface. (b) Results of occluded surface determination
when the sequence of three frames in Fig. 5 is processed in the reverse
order.

The results of processing in the forward direction are shown
in Fig. 6(a). All of the square white points represent accretion
or deletion frame 2 tokens, which were included in the second
(D2) or first (D1) disparity field. The line which emanates
from each box projects toward the surface which the algorithm
indicates is being occluded. The set of tokens to the
right of the leopard are deletion points. Tokens near the left
border of the image are accretion points, which appear as more
of the leopard moves into the field of view. Vectors associated
with these points indicate that the leopard is being occluded
by the surrounding frame. Except for six noise points, all accretion
and deletion tokens have an associated vector pointing
in the correct direction. The noise points are not in the
accretion or deletion regions, but rather occur in or near untextured
regions, or on the edge of the accretion/deletion regions.
As a result, there are either no other tokens in the
vicinity,  or  else  a  large  number  of  unmatched  tokens  in  the 
neighboring  accretion/deletion  region.  These  points  are  thus 
incorrectly  identified  as  accretion  or  deletion  points. 

Fig.  6(b)  shows  the  results  when  the  image  sequence  of  Fig. 
5  is  processed  in  the  reverse  order.  The  disparity  field  D1  is 
now  the  set  of  matches  for  frames  3  and  2,  and  D2  for  frames 
2 and 1. Tokens to the right of the leopard are now accretion
points,  and  tokens  near  the  left  border  of  the  image  are  dele¬ 
tion  points.  Once  again,  except  for  nine  noise  points,  all 
vectors  correctly  point  toward  the  occluded  surface.  The  noise 
points  are  due  to  the  same  causes  described  in  the  previous 
paragraph. 